Abstract

This paper deals with a general class of observation-driven time series mod- 
els with a special focus on time series of counts. We provide conditions under 
which there exist strict-sense stationary and ergodic versions of such processes. 
The consistency of the maximum likelihood estimators is then derived for
well-specified and misspecified models.

There has recently been a strong renewed interest in developing models for
time series of counts, which arise in a wide variety of applications: economics,
finance, epidemiology, population dynamics, and others. Among the models proposed
so far, the observation-driven models introduced by Cox (1981) play an important
role (see Kedem and Fokianos (2002, Chapter 4) for a comprehensive account
and Tjøstheim (2012) for a recent survey). In time series of counts, the
observations are realisations of some integer-valued distribution (e.g. Poisson,
negative binomial) depending on some parameters that drive the dynamics
of the model. In this paper, we focus on the so-called observation-driven time
series models, in which the parameter depends solely on the past observations.
Examples of such models include the Poisson integer-valued GARCH (INGARCH) model
(see Ferland et al. (2006), Zhu (2011) or Fokianos et al. (2009)), Poisson
threshold models (see Henderson et al. (2011)) and log-linear Poisson autoregression
(see Fokianos and Tjøstheim (2011)); see also Davis et al. (2003), Davis and Liu (2012)
and Neumann (2011) for other observation-driven models for Poisson counts.

This paper discusses the theory and inference for a general class of
observation-driven models which includes the models introduced above as particular
examples. Compared to the approach introduced in Fokianos and Tjøstheim (2011),
our argument is not based on the so-called perturbation technique. Recall that
this technique consists of two steps: in a first step, a perturbed version of the
process is shown to be geometrically ergodic; in a second step, the perturbed
process is shown to converge to the original one by letting the perturbation
go to 0. These two steps make it possible to develop a likelihood theory
on the perturbed process and then to take the limit. As argued by Doukhan
(2012), this approximation technique might seem unnatural and is technically
involved. In addition, it relies heavily on the Poisson assumption. The approach
developed by Neumann (2011) is more direct but is based on a contraction
assumption on the intensity of the Poisson variable which is satisfied neither
in the log-linear Poisson autoregression model nor in the Poisson threshold
model. We do not follow the weak dependence approach which,
as outlined in Doukhan et al. (2012), also implies unnecessary Lipschitz
assumptions on the model and does not directly yield a theory for likelihood inference.
Those authors apply the results of Doukhan and Wintenberger (2008); the latter use
a contraction argument, also adapted to deal with more general infinite memory
models, which essentially extends assumption (13) below relative to the
current Markov case. Those authors also derived weak dependence conditions
for such models; we should note, however, that such tailor-made dependence
conditions do not yield results as sharp as those of the present techniques.

Our approach is based on the theory of Markov chains without irreducibility
assumption. We first prove the existence of a stationary distribution using
the result of Tweedie (1988). The main difficulty when the Markov chain is
not necessarily irreducible consists in proving the uniqueness of the stationary
distribution. For that purpose, we extend the delicate argument introduced by
Henderson et al. (2011) and based on the theory of asymptotically strong Feller
Markov chains (see Hairer and Mattingly (2006)). Our extension introduces a
drift term which adds considerable flexibility to the model assumptions and
allows us to cover the log-linear Poisson autoregression model under assumptions
which are weaker than those reported in Fokianos and Tjøstheim (2011). We
then establish ergodicity for the two-sided stationary version of the process
under the sole assumption of existence and uniqueness of the stationary
distribution. Finally, we develop the theory of likelihood inference by approximating
the conditional likelihood by an appropriately defined stationary version of it,
which is shown to converge using classical ergodic theory arguments. Our
likelihood inference theory covers both well-specified and misspecified models. We
focus on the consistency of the conditional likelihood estimator, but the
asymptotic normality can also be covered using stationary martingale arguments. Due
to space constraints, this will be reported in a forthcoming paper.

The organization of the paper is as follows. Section 1 formulates the model,
establishes the existence and uniqueness of the invariant distribution and shows
the ergodicity and existence of some moments for the observation process. The
maximum likelihood estimates of the parameters and the relevant asymptotic
theory are then derived in Section 2. Examples of threshold autoregressive and
log Poisson counts are used to illustrate our findings. The proofs are given in
Section 3. Finally, the Appendix contains general statements about the
ergodicity of Markov chains under minimal assumptions which might be of independent
interest.



1. Ergodicity of the Observation-driven time series model 

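Let $(\mathsf{X}, d)$ be a locally compact, complete and separable metric space and
denote by $\mathcal{X}$ the associated Borel sigma-field. Let $(\mathsf{Y}, \mathcal{Y})$ be a measurable space,
$H$ a Markov kernel from $(\mathsf{X}, \mathcal{X})$ to $(\mathsf{Y}, \mathcal{Y})$ and $(x, y) \mapsto f_y(x)$ a measurable
function from $(\mathsf{X} \times \mathsf{Y}, \mathcal{X} \otimes \mathcal{Y})$ to $(\mathsf{X}, \mathcal{X})$.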

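Definition 1. An observation-driven time series model on $\mathbb{N}$ is a stochastic
process $\{(X_n, Y_n), n \in \mathbb{N}\}$ on $\mathsf{X} \times \mathsf{Y}$ satisfying the following recursions: for all
$k \in \mathbb{N}$,

$$
\begin{aligned}
Y_{k+1} \mid \mathcal{F}_k &\sim H(X_k; \cdot), \\
X_{k+1} &= f_{Y_{k+1}}(X_k),
\end{aligned}
\tag{1}
$$

where $\mathcal{F}_k = \sigma(X_\ell, Y_\ell;\ \ell \le k,\ \ell \in \mathbb{N})$. Similarly, $\{(X_n, Y_n), n \in \mathbb{Z}\}$ is an
observation-driven time series model on $\mathbb{Z}$ if the previous recursion holds for
all $k \in \mathbb{Z}$ with $\mathcal{F}_k = \sigma(X_\ell, Y_\ell;\ \ell \le k,\ \ell \in \mathbb{Z})$.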



Observation-driven time series models have been introduced by Cox (1981)
and later considered by Streett (2000), Davis et al. (2003), Fokianos et al. (2009),
Neumann (2011) and Doukhan et al. (2012).

In an observation-driven time series model, $\{Y_n\}_{n \in \mathbb{N}}$ are observed whereas
$\{X_n\}_{n \in \mathbb{N}}$ are not observed. This model shares similarities with Hidden Markov
Models, the main difference lying in the fact that, given $X_0$ and $k$ successive
observations $Y_1, \ldots, Y_k$, (1) allows one to compute $X_k$. In the following, the notation
$u_{s:t}$ stands for $(u_s, \ldots, u_t)$ for $s \le t$.

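Example 2. The GARCH(1,1) model defined by

$$
\begin{aligned}
Y_{k+1} \mid \sigma^2_{0:k}, Y_{0:k} &\sim \mathcal{N}(0, \sigma_k^2), \\
\sigma^2_{k+1} &= d + a \sigma_k^2 + b Y_{k+1}^2,
\end{aligned}
$$

where $\min(d, a, b) > 0$, can be written as in (1) by setting $X_k = \sigma_k^2$ and
$f_y(x) = d + ax + by^2$.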

Example 3. The Poisson threshold model defined by

$$
\begin{aligned}
Y_{k+1} \mid X_{0:k}, Y_{0:k} &\sim \mathcal{P}(X_k), \\
X_{k+1} &= \omega + a X_k + b Y_{k+1} + (c X_k + d Y_{k+1}) \mathbf{1}\{Y_{k+1} \notin (L, U)\},
\end{aligned}
$$

where $\mathcal{P}(\lambda)$ is the Poisson distribution of parameter $\lambda$ and $0 < L < U < \infty$, can
be written as in (1) by setting $f_y(x) = \omega + ax + by + (cx + dy)\mathbf{1}\{y \notin (L, U)\}$.

Note that $X_n$ being the parameter of a Poisson distribution, it should be
nonnegative. It is therefore usually assumed that $X_0 > \omega$ and
$\min(\omega, a, b, a + c, b + d) > 0$.

Example 4. The log-linear Poisson autoregression model introduced by Fokianos and
Tjøstheim (2011) and defined by

$$
\begin{aligned}
Y_{k+1} \mid X_{0:k}, Y_{0:k} &\sim \mathcal{P}(e^{X_k}), \\
X_{k+1} &= d + a X_k + b \ln(1 + Y_{k+1}),
\end{aligned}
$$

where $\mathcal{P}(\lambda)$ is the Poisson distribution of parameter $\lambda$, can also be written as in
(1) by setting $f_y(x) = d + ax + b\ln(1 + y)$.
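
To make Examples 3 and 4 concrete, the following Python sketch simulates recursion (1) for both models. It is only an illustration: the helper names (`sample_poisson`, `simulate`, `f_threshold`, `f_loglinear`) and all parameter values are assumptions chosen for this example, and the Poisson sampler uses Knuth's multiplication method, which is only suitable for moderate intensities.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Only meant for moderate lam (exp(-lam) must not underflow).
    """
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def simulate(f, intensity, x0, n, rng):
    """Simulate recursion (1) with a Poisson kernel:
    Y_{k+1} | F_k ~ Poisson(intensity(X_k)), X_{k+1} = f(Y_{k+1}, X_k)."""
    xs, ys = [x0], []
    for _ in range(n):
        y = sample_poisson(intensity(xs[-1]), rng)
        ys.append(y)
        xs.append(f(y, xs[-1]))
    return xs, ys


def f_threshold(omega, a, b, c, d, low, up):
    """f_y(x) of Example 3; the extra term is active when y is outside (low, up)."""
    def f(y, x):
        outside = 0.0 if low < y < up else 1.0
        return omega + a * x + b * y + (c * x + d * y) * outside
    return f


def f_loglinear(d, a, b):
    """f_y(x) of Example 4."""
    return lambda y, x: d + a * x + b * math.log1p(y)


rng = random.Random(0)
# Hypothetical parameter values, chosen only for illustration.
f3 = f_threshold(omega=1.0, a=0.3, b=0.2, c=-0.1, d=-0.1, low=2.5, up=7.5)
xs3, ys3 = simulate(f3, lambda x: x, x0=1.5, n=500, rng=rng)
f4 = f_loglinear(d=0.1, a=0.3, b=0.2)
xs4, ys4 = simulate(f4, math.exp, x0=0.0, n=500, rng=rng)
print(sum(ys3) / len(ys3), sum(ys4) / len(ys4))
```

In both cases only the map `f` and the link between the hidden state and the Poisson intensity change (identity for Example 3, exponential for Example 4); the recursion itself is the same.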

A natural question is to find conditions under which there exists a strict-sense
stationary and ergodic version of the observation process $\{Y_k\}_{k \in \mathbb{N}}$. Note
that for the GARCH(1,1) model as described in Example 2, this problem can be
easily solved by exploiting known results on random coefficient autoregressive
processes; see for example Brandt (1986), Bougerol and Picard (1992) and the
references therein.

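Since $\{Y_k\}_{k \in \mathbb{N}}$ is not itself a Markov chain, a classical approach is to prove the
existence of a strict-sense stationary ergodic process $\{Y_k\}_{k \in \mathbb{N}}$ as a deterministic
function of an ergodic Markov chain. To this aim, it is worthwhile to note that
$\{((X_n, Y_n), \mathcal{F}^{X,Y}_n), n \in \mathbb{N}\}$ is a Markov chain on $(\mathsf{X} \times \mathsf{Y}, \mathcal{X} \otimes \mathcal{Y})$ with respect
to its natural filtration

$$
\mathcal{F}^{X,Y} = (\mathcal{F}^{X,Y}_k, k \in \mathbb{N}), \quad \text{where } \mathcal{F}^{X,Y}_k = \sigma((X_\ell, Y_\ell),\ 1 \le \ell \le k,\ X_0),
$$

and that $\{(X_n, \mathcal{F}^{X}_n), n \in \mathbb{N}\}$ is also a Markov chain on $(\mathsf{X}, \mathcal{X})$ with respect to
its natural filtration

$$
\mathcal{F}^{X} = (\mathcal{F}^{X}_k, k \in \mathbb{N}), \quad \text{where } \mathcal{F}^{X}_k = \sigma(X_\ell,\ 0 \le \ell \le k).
$$

Denote now by $Q$ the Markov kernel associated to $\{X_k, k \in \mathbb{N}\}$ defined implicitly
by the recursions (1).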

In this section, we derive general conditions expressed in terms of $H$ and
$f$ under which $\{X_k, k \in \mathbb{N}\}$ and $\{(X_k, Y_k), k \in \mathbb{N}\}$ admit a unique invariant
probability distribution. This is a particularly tricky task when the observation
process $\{Y_n\}_{n \in \mathbb{N}}$ is integer-valued as in Example 3 and Example 4. In such a case,
the Markov chain $\{X_n\}_{n \in \mathbb{N}}$ started from $X_0 = x_0$ takes values in

$$
\{ f_{y_k} \circ \cdots \circ f_{y_1}(x_0) \ :\ k \in \mathbb{N},\ (y_1, \ldots, y_k) \in \mathbb{Z}^k \},
$$

which is a countable subset of $\mathsf{X}$. When starting from two different points $x_0$ and
$x_0'$, the values taken by $\{X_n\}_{n \in \mathbb{N}}$ may belong to two disjoint countable subsets
of $\mathsf{X}$. In that case, the total variation distance between $Q^n(x_0, \cdot)$ and $Q^n(x_0', \cdot)$
is always equal to 2 regardless of the value of $n \in \mathbb{N}$ and thus does not converge
to 0. We therefore stress that the results obtained in the sequel do not assume
that the Markov chain is irreducible.
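
The lack of irreducibility can be seen on a toy case. The sketch below enumerates, in exact rational arithmetic, the states reachable in $n$ steps under the linear map $f_y(x) = \omega + ax + by$ (the Poisson threshold model of Example 3 with $c = d = 0$). The parameter values, the truncation of the observations to $\{0, \ldots, y_{\max}\}$ and the function name are assumptions made for this illustration only.

```python
from fractions import Fraction
from itertools import product


def reachable_states(x0, n, omega, a, b, y_max):
    """States f_{y_n} o ... o f_{y_1}(x0) for all (y_1, ..., y_n) in {0, ..., y_max}^n,
    with the linear map f_y(x) = omega + a * x + b * y."""
    states = set()
    for ys in product(range(y_max + 1), repeat=n):
        x = x0
        for y in ys:
            x = omega + a * x + b * y
        states.add(x)
    return states


# Hypothetical parameters; exact fractions avoid floating-point coincidences.
omega, a, b = Fraction(1), Fraction(1, 2), Fraction(1, 4)
for n in range(1, 5):
    s = reachable_states(Fraction(0), n, omega, a, b, y_max=4)
    s_prime = reachable_states(Fraction(1, 3), n, omega, a, b, y_max=4)
    print(n, len(s), len(s_prime), s.isdisjoint(s_prime))
```

With these values, every state reached from $x_0 = 0$ is a dyadic rational, while every state reached from $x_0' = 1/3$ is a dyadic rational plus $2^{-n}/3$, so the two sets never intersect and the disjointness flag is `True` for every $n$.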

1.1. Coupling construction and main results 

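The proof is based on a coupling construction on Markov chains which is
now described.

Introduce a kernel $\bar{H}$ from $(\mathsf{X}^2, \mathcal{X}^{\otimes 2})$ to $(\mathsf{Y}^2, \mathcal{Y}^{\otimes 2})$ satisfying the following
conditions on the marginals: for all $(x, x') \in \mathsf{X}^2$ and $A \in \mathcal{Y}$,

$$
\bar{H}((x, x'); A \times \mathsf{Y}) = H(x, A), \quad \bar{H}((x, x'); \mathsf{Y} \times A) = H(x', A). \tag{2}
$$

Let $C \in \mathcal{Y}^{\otimes 2}$ be such that $\bar{H}((x, x'); C) \ne 0$ and consider the Markov chain
$\{Z_k = (X_k, X_k', U_k), k \in \mathbb{N}\}$ on the "extended" space $(\mathsf{X}^2 \times \{0, 1\}, \mathcal{X}^{\otimes 2} \otimes \mathcal{P}(\{0, 1\}))$
with transition kernel $\bar{Q}$ implicitly defined as follows. Given $Z_k = (x, x', u) \in
\mathsf{X}^2 \times \{0, 1\}$, draw $(Y_{k+1}, Y'_{k+1})$ according to $\bar{H}((x, x'); \cdot)$ and set

$$
\begin{aligned}
X_{k+1} &= f_{Y_{k+1}}(x), \quad X'_{k+1} = f_{Y'_{k+1}}(x'), \quad U_{k+1} = \mathbf{1}_C(Y_{k+1}, Y'_{k+1}), \\
Z_{k+1} &= (X_{k+1}, X'_{k+1}, U_{k+1}).
\end{aligned}
$$

The conditions on the marginals of $\bar{H}$ given by (2) also imply conditions on the
marginals of $\bar{Q}$: for all $A \in \mathcal{X}$ and $z = (x, x', u) \in \mathsf{X}^2 \times \{0, 1\}$,

$$
\bar{Q}(z; A \times \mathsf{X} \times \{0, 1\}) = Q(x; A), \quad \bar{Q}(z; \mathsf{X} \times A \times \{0, 1\}) = Q(x'; A). \tag{3}
$$

For $z = (x, x', u) \in \mathsf{X}^2 \times \{0, 1\}$, write

$$
\alpha(x, x') = \bar{Q}(z; \mathsf{X}^2 \times \{1\}) = \bar{H}((x, x'); C) \ne 0. \tag{4}
$$

The quantity $\alpha(x, x')$ is thus the probability of the event $\{U_1 = 1\}$ conditionally
on $Z_0$, taken on $Z_0 = z$. Denote by $Q^\sharp$ the kernel on $(\mathsf{X}^2, \mathcal{X}^{\otimes 2})$ defined by: for
all $z = (x, x', u) \in \mathsf{X}^2 \times \{0, 1\}$ and $A \in \mathcal{X}^{\otimes 2}$,

$$
Q^\sharp((x, x'); A) = \frac{\bar{Q}(z; A \times \{1\})}{\bar{Q}(z; \mathsf{X}^2 \times \{1\})},
$$

so that, using (4),

$$
\bar{Q}(z; A \times \{1\}) = \alpha(x, x')\, Q^\sharp((x, x'); A). \tag{5}
$$

This shows that $Q^\sharp((x, x'); \cdot)$ is the distribution of $(X_1, X_1')$ conditionally on
$(X_0, X_0', U_1) = (x, x', 1)$. Consider the following assumptions: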

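(A1) The Markov kernel $Q$ is weak Feller. Moreover, there exist a compact set
$C \in \mathcal{X}$, $(b, \epsilon) \in \mathbb{R}_+ \times \mathbb{R}_+^*$ and a function $V : \mathsf{X} \to \mathbb{R}_+$ such that

$$
QV \le V - \epsilon + b \mathbf{1}_C. \tag{6}
$$

Following (Meyn and Tweedie, 1993, Definition 6.1.2), a point $x_0 \in \mathsf{X}$ is said
to be reachable for the Markov kernel $Q$ if for all $x \in \mathsf{X}$ and all open sets $A$
containing $x_0$, we have $\sum_n Q^n(x, A) > 0$.

(A2) The Markov kernel $Q$ has a reachable point.

In what follows, if $(\mathsf{E}, \mathcal{E})$ is a measurable space, $\xi$ a probability distribution on
$(\mathsf{E}, \mathcal{E})$ and $R$ a Markov kernel on $(\mathsf{E}, \mathcal{E})$, we denote by $\mathbb{P}^R_\xi$ the probability induced
on $(\mathsf{E}^{\mathbb{N}}, \mathcal{E}^{\otimes \mathbb{N}})$ by a Markov chain with transition kernel $R$ and initial distribution
$\xi$. We denote by $\mathbb{E}^R_\xi$ the associated expectation.

(A3) There exist a kernel $\bar{Q}$ on $(\mathsf{X}^2 \times \{0, 1\}, \mathcal{X}^{\otimes 2} \otimes \mathcal{P}(\{0, 1\}))$, a kernel $Q^\sharp$ on
$(\mathsf{X}^2, \mathcal{X}^{\otimes 2})$ and a measurable function $\alpha : \mathsf{X}^2 \to [0, 1]$ satisfying (3) and (5),
a measurable function $W : \mathsf{X}^2 \to [1, \infty)$ and real numbers $(D, \zeta_1, \zeta_2, \rho) \in
(\mathbb{R}_+)^3 \times (0, 1)$ such that for all $(x, x') \in \mathsf{X}^2$ and all $n \in \mathbb{N}$,

$$
\begin{aligned}
1 - \alpha(x, x') &\le d(x, x') W(x, x'), && (7) \\
\mathbb{E}^{Q^\sharp}_{\delta_x \otimes \delta_{x'}} [d(X_n, X_n')] &\le D \rho^n d(x, x'), && (8) \\
\mathbb{E}^{Q^\sharp}_{\delta_x \otimes \delta_{x'}} [d(X_n, X_n') W(X_n, X_n')] &\le D \rho^n d^{\zeta_1}(x, x') W^{\zeta_2}(x, x'). && (9)
\end{aligned}
$$

Moreover, for all $x \in \mathsf{X}$, there exists $\gamma_x > 0$ such that

$$
\sup_{x' \in B(x, \gamma_x)} W(x, x') < \infty. \tag{10}
$$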

Remark 5. The assumption (A1) implies by (Tweedie, 1988, Theorem 2) that
the Markov kernel $Q$ admits at least one stationary distribution. Assumptions
(A2)-(A3) are then used to show that this stationary distribution is unique.

Remark 6. These assumptions weaken the Lipschitz conditions obtained by
(Henderson et al., 2011, eq. (15)) by introducing a "drift" function $W$ in (A3).
This allows us to treat, for example, the log-linear Poisson autoregression under
minimal assumptions. It thus answers an open question raised by (Henderson et al.,
2011, p. 816) on dealing with models which do not satisfy the Lipschitz condition as
expressed in (Henderson et al., 2011, eq. (15)).

Remark 7. Eq. (5) shows that we can simulate $(X_1, X_1', U_1)$ according to $\bar{Q}((x, x', u); \cdot)$
as follows. Toss a coin with probability of heads $\alpha(x, x')$. If the coin lands heads,
then set $U_1 = 1$ and draw $(X_1, X_1') \sim Q^\sharp((x, x'); \cdot)$. Otherwise, set $U_1 = 0$ and
draw $(X_1, X_1')$ according to

$$
A \mapsto \frac{\bar{Q}((x, x', u); A \times \{0\})}{1 - \alpha(x, x')}.
$$

Under (8) and (9), the stochastic processes

$$
\{d(X_k, X_k'), k \in \mathbb{N}\} \quad \text{and} \quad \{d(X_k, X_k') W(X_k, X_k'), k \in \mathbb{N}\},
$$

conditionally on the fact that the coin lands heads repeatedly, go geometrically
fast to 0 in expectation. When the coin lands tails, nothing is assumed about the
behavior of these processes, but we can bound the probability of this event by (7).

Theorem 8. Assume that (A1)-(A3) hold. Then, the Markov kernel $Q$ admits a
unique invariant probability measure.

Proof. The proof is postponed to Section 3. □

Note that Theorem 8 does not provide a rate of convergence to the stationary
distribution. Nevertheless, when discussing inference in these models, some
moment conditions with respect to the stationary distribution are needed. The
following lemma allows one to assess whether a function $f$ is integrable with respect to an
invariant distribution of the Markov kernel $Q$.

Lemma 9. Assume that the Markov kernel $Q$ admits an invariant probability measure $\pi$ and
that there exist a measurable function $V : \mathsf{X} \to \mathbb{R}_+$ and real numbers $(\lambda, \beta) \in
(0, 1) \times \mathbb{R}_+$ such that $QV \le \lambda V + \beta$. Then,

$$
\pi V \le \beta / (1 - \lambda) < \infty.
$$

Proof. The proof is postponed to Section 3. □

Proposition 10. Assume that the Markov kernel $Q$ admits a unique invariant
probability measure. Then, there exists a strict-sense stationary ergodic process
on $\mathbb{Z}$, $\{Y_n\}_{n \in \mathbb{Z}}$, solution to the recursion (1).

Proof. Denote by $\pi$ the unique invariant distribution of the Markov kernel $Q$.
Now, let $\{(X_n, Y_n), n \in \mathbb{N}\}$ be the Markov chain satisfying (1). If $\bar{\pi}$ is an
invariant distribution for $\{(X_n, Y_n), n \in \mathbb{N}\}$, then the marginal distribution
$A \mapsto \bar{\pi}(A \times \mathsf{Y})$ is a stationary distribution for the Markov kernel $Q$ and since $\pi$
is unique, $\bar{\pi}(A \times \mathsf{Y}) = \pi(A)$. If $(X_0, Y_0) \sim \bar{\pi}$, then by (1), $(X_1, Y_1)$ is distributed
according to $B \mapsto \iint \pi(\mathrm{d}x) H(x; \mathrm{d}y_1) \mathbf{1}_B(f_{y_1}(x), y_1)$. Since $\bar{\pi}$ is an invariant
distribution for $\{(X_n, Y_n), n \in \mathbb{N}\}$, we therefore obtain

$$
\bar{\pi}(B) = \iint \pi(\mathrm{d}x) H(x; \mathrm{d}y_1) \mathbf{1}_B(f_{y_1}(x), y_1), \quad \text{for all } B \in \mathcal{X} \otimes \mathcal{Y}. \tag{12}
$$

Thus, the Markov chain $\{(X_n, Y_n), n \in \mathbb{N}\}$ has a unique invariant distribution
given by (12). By applying Theorem 32 and Theorem 33, there exists a strict-sense
stationary ergodic process on $\mathbb{Z}$, $\{(X_n, Y_n), n \in \mathbb{Z}\}$, solution to the
recursion (1). The proof follows. □

We end the section by providing some practical conditions for checking (8)
and (9) in (A3).

Lemma 11. Assume that either (i) or (ii) or (iii) (defined below) holds.

(i) There exists $(\rho, \beta) \in (0, 1) \times \mathbb{R}$ such that for all $(x, x') \in \mathsf{X}^2$,

$$
\begin{aligned}
d(X_1, X_1') &\le \rho\, d(x, x'), \quad \mathbb{P}^{Q^\sharp}_{\delta_x \otimes \delta_{x'}}\text{-a.s.} && (13) \\
Q^\sharp W &\le W + \beta && (14)
\end{aligned}
$$

(ii) (13) holds and $W$ is bounded.

(iii) (13) holds and there exist $0 < \alpha < \alpha'$ and $\beta \in \mathbb{R}_+$ such that for all
$(x, x') \in \mathsf{X}^2$,

$$
d(x, x') \le W^{\alpha}(x, x').
$$

Then, (8) and (9) hold.

Remark 12. Ergodicity under Lipschitz conditions has been studied in a wide
literature including Sunyach (1975) or Diaconis and Freedman, but the
fact that the various contraction conditions in Lemma 11 are related to the
kernel $Q^\sharp$ and not to the kernels $Q$ or $\bar{Q}$ makes it possible to check the assumptions
(8) and (9) quite directly.

Proof. See Section 3. □
1.2. Examples 

1.2.1. A Poisson threshold model 

Existence and uniqueness of the stationary distribution for the Poisson threshold
model have already been discussed in Henderson et al. (2011). We can obtain
the same results by applying Theorem 8 provided that assumptions (A1)-(A3)
hold. Consider a Markov chain $\{X_n, n \in \mathbb{N}\}$ with a transition kernel $Q$ given
implicitly by the following recursive equations:

$$
\begin{aligned}
Y_{n+1} \mid X_{0:n}, Y_{0:n} &\sim \mathcal{P}(X_n), \\
X_{n+1} &= \omega + a X_n + b Y_{n+1} + (c X_n + d Y_{n+1}) \mathbf{1}\{Y_{n+1} \notin (L, U)\},
\end{aligned}
$$

where $0 < L < U < \infty$. Moreover, to keep the parameter of the Poisson
distribution positive, it is assumed that $X_0 > \omega$ and $\min(\omega, a, b, a + c, b + d) > 0$.
Here, we set $\mathsf{X} = \mathbb{R}_+$, $d(x, x') = |x - x'|$ and

$$
f_y(x) = \omega + ax + by + (cx + dy) \mathbf{1}\{y \notin (L, U)\}.
$$

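Lemma 13. Assume that $a \vee (a + c) < 1$, then (A3) holds.

Proof. Define implicitly $\bar{Q}$ as the transition kernel of the Markov chain $\{Z_n, n \in \mathbb{N}\}$
with $Z_n = (X_n, X_n', U_n)$ in the following way. Given $Z_n = (x, x', u)$, if $x \le x'$,
draw independently $Y_{n+1} \sim \mathcal{P}(x)$, $V_{n+1} \sim \mathcal{P}(x' - x)$ and set $Y'_{n+1} = Y_{n+1} + V_{n+1}$.
Otherwise, draw independently $Y'_{n+1} \sim \mathcal{P}(x')$ and $V_{n+1} \sim \mathcal{P}(x - x')$ and set
$Y_{n+1} = Y'_{n+1} + V_{n+1}$. In all cases, set

$$
\begin{aligned}
X_{n+1} &= f_{Y_{n+1}}(x), \quad X'_{n+1} = f_{Y'_{n+1}}(x'), \\
U_{n+1} &= \mathbf{1}\{Y_{n+1} = Y'_{n+1}\} = \mathbf{1}\{V_{n+1} = 0\}, \\
Z_{n+1} &= (X_{n+1}, X'_{n+1}, U_{n+1}).
\end{aligned}
$$

Note again that if $Y \sim \mathcal{P}(\lambda)$, $V \sim \mathcal{P}(\lambda')$ and $(Y, V)$ are independent, then
$Y + V \sim \mathcal{P}(\lambda + \lambda')$. This implies that $\bar{Q}$ satisfies the marginal conditions (3).
Define for all $x^\sharp = (x, x') \in \mathbb{R}^2$, $Q^\sharp(x^\sharp; \cdot)$ as the law of $(X_1, X_1')$ where

$$
X_1 = f_Y(x), \quad X_1' = f_Y(x'),
$$

and $Y \sim \mathcal{P}(x \wedge x')$, and set, for all $x^\sharp = (x, x') \in \mathbb{R}^2$,

$$
\alpha(x^\sharp) = \exp\{-|x - x'|\}.
$$

With these definitions, obviously, $\bar{Q}$, $\alpha$ and $Q^\sharp$ satisfy (5). Moreover, using
$1 - e^{-u} \le u$, we obtain

$$
1 - \alpha(x^\sharp) = 1 - \exp\{-|x - x'|\} \le |x - x'|,
$$

so that (7) holds with $W \equiv 1$. To obtain (8) and (9), we apply Lemma 11 by
checking (ii) in Lemma 11:

$$
\mathbb{P}^{Q^\sharp}_{\delta_x \otimes \delta_{x'}} \big\{ |X_1 - X_1'| = |a + c \mathbf{1}\{Y_1 \notin (L, U)\}|\, |x - x'| \le \rho |x - x'| \big\} = 1, \tag{15}
$$

where $\rho = a \vee (a + c) < 1$. The function $W$ being constant, (ii) holds and the
proof is completed. □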
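
The coupling used in this proof is easy to simulate. The following sketch (helper names and parameter values are assumptions for illustration) draws one step of $\bar{Q}$ for the Poisson threshold model, estimates $\mathbb{P}(U_1 = 1)$ by Monte Carlo for comparison with $\alpha(x, x') = \exp\{-|x - x'|\}$, and checks the contraction in (15) on the event $\{U_1 = 1\}$.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the moderate intensities used here."""
    threshold = math.exp(-lam)
    count, prod = 0, rng.random()
    while prod > threshold:
        count += 1
        prod *= rng.random()
    return count


def f_threshold(y, x, omega, a, b, c, d, low, up):
    """f_y(x) of the Poisson threshold model."""
    outside = 0.0 if low < y < up else 1.0
    return omega + a * x + b * y + (c * x + d * y) * outside


def coupled_step(x, x_prime, params, rng):
    """One coupling step: a common Poisson(min(x, x')) part, plus an independent
    Poisson(|x - x'|) part added to the chain with the larger intensity."""
    common = sample_poisson(min(x, x_prime), rng)
    extra = sample_poisson(abs(x - x_prime), rng)
    if x <= x_prime:
        y, y_prime = common, common + extra
    else:
        y, y_prime = common + extra, common
    u = 1 if extra == 0 else 0
    return f_threshold(y, x, **params), f_threshold(y_prime, x_prime, **params), u


# Hypothetical parameter values with a v (a + c) = 0.7 < 1.
params = dict(omega=1.0, a=0.3, b=0.2, c=0.4, d=0.1, low=2.5, up=7.5)
rho = max(params["a"], params["a"] + params["c"])
rng = random.Random(1)
x, x_prime, n_draws, hits = 2.0, 2.5, 20000, 0
for _ in range(n_draws):
    x1, x1_prime, u = coupled_step(x, x_prime, params, rng)
    if u == 1:
        hits += 1
        # On {U_1 = 1} both chains see the same observation, so (15) applies.
        assert abs(x1 - x1_prime) <= rho * abs(x - x_prime) + 1e-12
print(hits / n_draws, math.exp(-abs(x - x_prime)))
```

The two printed numbers should be close, since $U_1 = 1$ exactly when the extra Poisson$(|x - x'|)$ draw is zero.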

Proposition 14. Assume that $(a + b + c + d) \vee a < 1$, then the Markov kernel
$Q$ admits a unique stationary distribution $\pi$. Moreover, $\pi V < \infty$ where $V$ is the
function $V : \mathbb{R}_+ \to \mathbb{R}_+$ defined by $V(x) = x$.

Proof. According to Theorem 8 and Lemma 13, it is enough to show (A1)-(A2)
to obtain the existence and uniqueness of an invariant probability measure $\pi$. We
start with (A1). A random variable of distribution $\mathcal{P}(\lambda)$ converges weakly to
a random variable of distribution $\mathcal{P}(\lambda')$ as $\lambda \to \lambda'$. This implies by Slutsky's
Lemma that if $N : \mathbb{R}_+ \to \mathbb{N}$, $x \mapsto N(x)$ is a Poisson process of unit intensity,
then

$$
X_1(x) = \omega + ax + bN(x) + (cx + dN(x)) \mathbf{1}\{N(x) \notin (L, U)\}
$$

converges weakly to $X_1(x')$ as $x \to x'$. Therefore, $Q$ is weak Feller. Moreover,
it can be readily checked that the nonnegative function $V(x) = x$ ($V$ is indeed
nonnegative as a function defined on $\mathsf{X} = \mathbb{R}_+$) satisfies:

$$
QV(x) = \big(a + b + c\, \mathbb{P}[N(x) \notin (L, U)] + d\, \mathbb{E}[N(x) \mathbf{1}_{N(x) \notin (L, U)}]/x\big) V(x) + \omega.
$$

It can be easily checked that

$$
\lim_{x \to \infty} \mathbb{P}[N(x) \notin (L, U)] = 1, \quad \text{and} \quad \lim_{x \to \infty} \mathbb{E}[N(x) \mathbf{1}_{N(x) \notin (L, U)}]/x = 1,
$$

so that

$$
\lim_{x \to \infty} \frac{QV(x)}{V(x)} = a + b + c + d < 1, \quad \text{and} \quad \sup_{0 \le x \le M} QV(x) < \infty, \quad \forall M \in \mathbb{R}_+.
$$

These two properties imply that there exist $(\lambda, \beta) \in (0, 1) \times \mathbb{R}_+$ such that

$$
QV \le \lambda V + \beta. \tag{16}
$$

Thus, the drift condition (6) holds and (A1) is satisfied. Set $x_\infty = \omega / (1 - a - c)$
and let $C$ be an open set containing $x_\infty$. Let $x \in \mathbb{R}_+$ and define recursively the
sequence $x_0 = x$ and for all $k \ge 1$, $x_k = \omega + (a + c) x_{k-1}$. Since $(a + c) < 1$, this
sequence has a unique limiting point, $\lim_{n \to \infty} x_n = x_\infty$. Therefore, there exists
some $n$ such that for all $k \ge n$, $x_k \in C$. For such $n$, we have

$$
\begin{aligned}
Q^n(x, C) = \mathbb{P}^Q_{\delta_x}(X_n \in C) &\ge \mathbb{P}^Q_{\delta_x}(X_n \in C,\ Y_1 = \ldots = Y_n = 0) \\
&= \mathbb{P}^Q_{\delta_x}(Y_1 = \ldots = Y_n = 0) > 0,
\end{aligned}
$$

so that (A2) holds. Moreover, since the function $V(x) = x$ satisfies (16), Lemma 9
shows that $\pi V < \infty$. □
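
The drift computation in this proof can be checked numerically. The sketch below (function names and parameter values are assumptions for this example) evaluates $QV(x)$ for $V(x) = x$ directly from the Poisson probability mass function and prints the ratio $QV(x)/V(x)$ next to the limit $a + b + c + d$.

```python
import math


def poisson_pmf(y, lam):
    """P(N = y) for N ~ Poisson(lam), lam > 0, computed on the log scale."""
    return math.exp(y * math.log(lam) - lam - math.lgamma(y + 1))


def drift_qv(x, omega, a, b, c, d, low, up):
    """QV(x) = E[omega + a x + b N + (c x + d N) 1{N not in (low, up)}]
    with N ~ Poisson(x) and V(x) = x."""
    inside = [y for y in range(math.floor(low) + 1, math.ceil(up)) if low < y < up]
    p_in = sum(poisson_pmf(y, x) for y in inside)      # P[N in (low, up)]
    m_in = sum(y * poisson_pmf(y, x) for y in inside)  # E[N 1{N in (low, up)}]
    return omega + (a + b) * x + c * x * (1.0 - p_in) + d * (x - m_in)


# Hypothetical parameter values, chosen only for illustration.
params = dict(omega=1.0, a=0.3, b=0.2, c=-0.1, d=-0.1, low=2.5, up=7.5)
limit = params["a"] + params["b"] + params["c"] + params["d"]
for x in (1.0, 10.0, 100.0, 500.0):
    print(x, drift_qv(x, **params) / x, limit)
```

For large $x$ the probability that $N(x)$ falls inside $(L, U)$ is negligible, so the ratio differs from the limit essentially by $\omega / x$.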

Remark 15. In the proof of Lemma 13, we check (A3) by verifying Lemma 11-(ii).
In some models, a less restrictive use of Lemma 11 may provide more flexibility, as can
be seen in the following example:

$$
X_{n+1} = \omega + (a + c \mathbf{1}\{Y_{n+1} = 0\}) X_n + b Y_{n+1} + d Y_{n+1} \mathbf{1}\{Y_{n+1} = 0\}.
$$

This is a particular Poisson threshold model as defined in Example 3 with $(L, U) =
(1/2, \infty)$. It is assumed that $X_0 > \omega$ and $\min(\omega, a, b, a + c, b + d) > 0$. Now,
according to Lemma 13, (A3) holds if $a \vee (a + c) < 1$. We can now prove that
(A3) holds even if $a + c > 1$ provided that $a \vee (a + c e^{-\omega}) < 1$. To see this, we
just adapt the proof of Lemma 13 by replacing (15) by

$$
\mathbb{E}^{Q^\sharp}_{\delta_x \otimes \delta_{x'}} \big[ (a + c \mathbf{1}\{Y_1 = 0\}) \big]\, |x - x'| \le \rho |x - x'|,
$$

where $\rho := a + c e^{-\omega} < 1$. This implies that (8) holds; since $W$ is constant,
(9) follows as well, which concludes the proof.

1.2.2. Log-linear Poisson autoregression 

Consider a Markov chain $\{X_n, n \in \mathbb{N}\}$ with a transition kernel $Q$ given
implicitly by the following recursive equations:

$$
\begin{aligned}
Y_{n+1} \mid X_{0:n}, Y_{0:n} &\sim \mathcal{P}(e^{X_n}), \\
X_{n+1} &= d + a X_n + b \ln(Y_{n+1} + 1),
\end{aligned}
\tag{17}
$$

where $\mathcal{P}(\lambda)$ is a Poisson distribution with parameter $\lambda$. In this case, the state
space is $\mathsf{X} = \mathbb{R}$, which is equipped with the Euclidean distance $d(x, x') = |x - x'|$,
and the function $f_y$ is defined by $f_y(x) = d + ax + b\ln(1 + y)$.

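Lemma 16. If $|a + b| \vee |a| \vee |b| < 1$, then (A3) holds.

Proof. Define implicitly $\bar{Q}$ as the transition kernel of the Markov chain $\{Z_n, n \in
\mathbb{N}\}$ with $Z_n = (X_n, X_n', U_n)$ in the following way. Given $Z_n = (x, x', u)$, if
$x \le x'$, draw independently $Y_{n+1} \sim \mathcal{P}(e^{x})$ and $V_{n+1} \sim \mathcal{P}(e^{x'} - e^{x})$ and set
$Y'_{n+1} = Y_{n+1} + V_{n+1}$. Otherwise, draw independently $Y'_{n+1} \sim \mathcal{P}(e^{x'})$ and $V_{n+1} \sim
\mathcal{P}(e^{x} - e^{x'})$ and set $Y_{n+1} = Y'_{n+1} + V_{n+1}$. In all cases, set

$$
\begin{aligned}
X_{n+1} &= d + ax + b \ln(Y_{n+1} + 1), \\
X'_{n+1} &= d + ax' + b \ln(Y'_{n+1} + 1), \\
U_{n+1} &= \mathbf{1}\{Y_{n+1} = Y'_{n+1}\} = \mathbf{1}\{V_{n+1} = 0\}, \\
Z_{n+1} &= (X_{n+1}, X'_{n+1}, U_{n+1}).
\end{aligned}
$$

Along the same lines as above, $\bar{Q}$ satisfies the marginal conditions (3). Moreover,
define for all $x^\sharp = (x, x') \in \mathsf{X}^2$, $Q^\sharp(x^\sharp; \cdot)$ as the law of $(X_1, X_1')$ where

$$
\begin{aligned}
X_1 &= d + ax + b \ln(Y + 1), \quad Y \sim \mathcal{P}(e^{x \wedge x'}), \\
X_1' &= d + ax' + b \ln(Y + 1),
\end{aligned}
\tag{18}
$$

and set for all $x^\sharp = (x, x') \in \mathbb{R}^2$,

$$
\alpha(x^\sharp) = \exp\{-e^{x \vee x'} + e^{x \wedge x'}\}.
$$

With these definitions, obviously, $\bar{Q}$, $\alpha$ and $Q^\sharp$ satisfy (5). Using twice
$1 - e^{-u} \le u$, we obtain

$$
\begin{aligned}
1 - \alpha(x^\sharp) = 1 - \exp\{-e^{x \vee x'} + e^{x \wedge x'}\} &\le e^{x \vee x'} - e^{x \wedge x'} \\
&= e^{x \vee x'}(1 - e^{-|x - x'|}) \le W(x, x') |x - x'|,
\end{aligned}
$$

with $W(x^\sharp) = e^{|x| \vee |x'|}$ so that (7) holds. To check (8) and (9), we apply
Lemma 11 by checking (i) in Lemma 11. Note first that

$$
\mathbb{P}^{Q^\sharp}_{\delta_x \otimes \delta_{x'}} \{ |X_1 - X_1'| = |a|\, |x - x'| \} = 1,
$$

so that (13) is satisfied. To check (14), we will show that

$$
\lim_{|x| \vee |x'| \to \infty} \frac{Q^\sharp W(x, x')}{W(x, x')} = 0, \tag{19}
$$

and for all $M > 0$,

$$
\sup_{|x| \vee |x'| \le M} Q^\sharp W(x, x') < \infty. \tag{20}
$$

Without loss of generality, we assume that $x \le x'$. Using (18), we get

$$
Q^\sharp W(x, x') = \mathbb{E}\big(e^{|X_1| \vee |X_1'|}\big) \le \mathbb{E}\big(e^{|X_1|}\big) + \mathbb{E}\big(e^{|X_1'|}\big). \tag{21}
$$

First consider the second term of the right-hand side of (21).

$$
\mathbb{E}\big(e^{|X_1'|}\big) \le e^{|d|}\, \mathbb{E}\big(e^{|ax' + b \ln(1 + Y)|}\big). \tag{22}
$$

Now, note that if $u$ and $v$ have different signs or if $v = 0$, then $|u + v| \le |u| \vee |v|$.
Otherwise, $|u + v| = (u + v)\mathbf{1}\{v > 0\} \vee (-u - v)\mathbf{1}\{v < 0\}$. This implies that

$$
e^{|u + v|} \le e^{|u|} + e^{|v|} + e^{u + v} \mathbf{1}\{v > 0\} + e^{-u - v} \mathbf{1}\{v < 0\}.
$$

Plugging this into (22),

$$
\begin{aligned}
\mathbb{E}\big(e^{|X_1'|}\big) \le e^{|d|} \Big( & e^{|a||x'|} + \mathbb{E}[(1 + Y)^{|b|}] + e^{ax'}\, \mathbb{E}[(1 + Y)^{b}]\, \mathbf{1}\{b > 0\} \\
& + e^{-ax'}\, \mathbb{E}[(1 + Y)^{-b}]\, \mathbf{1}\{b < 0\} \Big).
\end{aligned}
$$

Note that for all $\gamma \in [0, 1]$,

$$
\mathbb{E}[(1 + Y)^{\gamma}] \le [\mathbb{E}(1 + Y)]^{\gamma} = (1 + e^{x})^{\gamma} \le 1 + e^{\gamma x} \le 1 + e^{\gamma |x|}.
$$

Moreover, since $|b| \in [0, 1]$, we have $b\mathbf{1}\{b > 0\} \in [0, 1]$ and $-b\mathbf{1}\{b < 0\} \in [0, 1]$.
Therefore,

$$
\begin{aligned}
\mathbb{E}\big(e^{|X_1'|}\big) &\le e^{|d|} \Big( e^{|a||x'|} + 1 + e^{|b||x|} + e^{ax'}(1 + e^{bx}) \mathbf{1}\{b > 0\} + e^{-ax'}(1 + e^{-bx}) \mathbf{1}\{b < 0\} \Big) \\
&\le e^{|d|} \Big( e^{|a||x'|} + 1 + e^{|b||x|} + e^{|a||x'|} + e^{|a + b||x'|} \Big) \\
&\le e^{|d|} \Big( 1 + 4 e^{\gamma(|x| \vee |x'|)} \Big),
\end{aligned}
$$

where $\gamma = |a| \vee |b| \vee |a + b| < 1$. The first term of the right-hand side of (21) is
treated as the second term by setting $x' = x$. We then have

$$
\mathbb{E}\big(e^{|X_1|}\big) \le e^{|d|} \Big( 1 + 4 e^{\gamma(|x| \vee |x'|)} \Big),
$$

so that using (21),

$$
Q^\sharp W(x, x') \le 2 e^{|d|} \Big( 1 + 4 e^{\gamma(|x| \vee |x'|)} \Big). \tag{23}
$$

Since $\gamma \in (0, 1)$ and $W(x, x') = e^{|x| \vee |x'|}$, (23) clearly implies (19) and (20). The
proof is completed. □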

Proposition 17. If $|a + b| \vee |a| \vee |b| < 1$, the Markov kernel $Q$ admits a unique
invariant probability measure $\pi$. Moreover, $\pi V < \infty$ where $V(x) = e^{|x|}$.

Remark 18. (Fokianos and Tjøstheim, 2011, Lemma 2.1) have obtained that
the log-linear Poisson autoregression is close to a "perturbed" ergodic log-linear
Poisson process in the case where $a^2 + b^2 < 1$ if $a$ and $b$ have different signs and
$|a + b| < 1$ otherwise. In both cases, we have $|a + b| < 1$. In fact, if $a^2 + b^2 < 1$,
then $|a| \vee |b| < 1$. Combining it with the fact that $a \wedge b \le a + b \le a \vee b$ when $a$ and $b$
have different signs, we obtain $|a + b| < 1$. Our conditions thus extend the conditions
of Fokianos and Tjøstheim (2011), and the results obtained here address an open
question raised in (Fokianos and Tjøstheim, 2011, page 566).

Proof of Proposition 17. According to Theorem 8 and Lemma 16, it is enough to show (A1)-(A2).
We consider first (A1). As above, $X_1(x) = d + ax + b \ln(1 + N(e^{x}))$ converges
weakly to $X_1(x')$ as $x \to x'$. Therefore, $Q$ is weak Feller. Moreover, following
the lines of the proof of Lemma 16, it can be readily checked that the function $V(x) = e^{|x|}$
satisfies:

$$
QV(x) \le e^{|d|} \big( 1 + 4 e^{\gamma |x|} \big),
$$

where $\gamma = |a + b| \vee |a| \vee |b| < 1$. Thus,

$$
QV(x) \le \lambda V(x) + \beta, \tag{24}
$$

for some constants $(\lambda, \beta) \in (0, 1) \times \mathbb{R}_+$, showing (6). Consider now $x_\infty = d/(1 - a)$.
Let $x \in \mathbb{R}$ and let $C$ be an open set containing $x_\infty$. Then, by setting $x_0 = x$
and for all $k \ge 1$, $x_k = d + a x_{k-1}$, we have $\lim_{n \to \infty} x_n = x_\infty$ so that there exists
some $n$ such that for all $k \ge n$, $x_k \in C$. For such $n$, we have

$$
\begin{aligned}
Q^n(x, C) = \mathbb{P}^Q_{\delta_x}(X_n \in C) &\ge \mathbb{P}^Q_{\delta_x}(X_n \in C,\ Y_1 = \ldots = Y_n = 0) \\
&= \mathbb{P}^Q_{\delta_x}(Y_1 = \ldots = Y_n = 0) > 0,
\end{aligned}
$$

so that (A2) holds. Since (24) holds for the function $V(x) = e^{|x|}$, Lemma 9
shows that $\pi V < \infty$. The proof follows. □



2. Consistency of the Maximum Likelihood Estimator

2.1. Misspecified models 

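Let $(\Theta, d)$ be a compact metric set of $\mathbb{R}^p$, let $H$ be a Markov kernel from
$(\mathsf{X}, \mathcal{X})$ to $(\mathsf{Y}, \mathcal{Y})$ and let $\{(x, y) \mapsto f^\theta_y(x),\ \theta \in \Theta\}$ be a family of measurable
functions from $(\mathsf{X} \times \mathsf{Y}, \mathcal{X} \otimes \mathcal{Y})$ to $(\mathsf{X}, \mathcal{X})$. Assume that for all $x \in \mathsf{X}$, $H(x; \cdot)$ is
dominated by some $\sigma$-finite measure $\mu$ on $(\mathsf{Y}, \mathcal{Y})$ and denote by $h(x; \cdot)$ its
Radon-Nikodym derivative: $h(x; y) = \mathrm{d}H(x; \cdot)/\mathrm{d}\mu(y)$. Assume that $h(x; y) > 0$ for all
$(x, y) \in \mathsf{X} \times \mathsf{Y}$ and that the sequence of random variables $\{(X_k, Y_k);\ k \in \mathbb{N}\}$
satisfies the following recursions

$$
\begin{aligned}
Y_{k+1} \mid \mathcal{F}_k &\sim H(X_k; \cdot), \\
X_{k+1} &= f^\theta_{Y_{k+1}}(X_k),
\end{aligned}
\tag{25}
$$

where $\mathcal{F}_k$ is either $\sigma(X_{0:k}, Y_{0:k})$ or $\sigma(X_{-\infty:k}, Y_{-\infty:k})$, depending on whether the
process is defined on $\mathbb{N}$ or $\mathbb{Z}$. Then, the distribution of $(Y_1, \ldots, Y_n)$ conditionally
on $X_0 = x$ has a density with respect to the product measure $\mu^{\otimes n}$ given by

$$
y_{1:n} \mapsto \prod_{k=1}^{n} h\big(f^\theta\langle y_{1:k-1}\rangle(x);\ y_k\big), \tag{26}
$$

where we have used the convention $f^\theta\langle y_{1:0}\rangle(x) = x$ and the notation

$$
f^\theta\langle y_{s:t}\rangle = f^\theta_{y_t} \circ f^\theta_{y_{t-1}} \circ \cdots \circ f^\theta_{y_s}, \quad s \le t. \tag{27}
$$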
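
Because the density (26) is built from the recursion, its logarithm can be accumulated in a single pass over the observations. The sketch below is an illustration under assumptions: the function names are hypothetical, and the concrete choice of a Poisson kernel with the linear map $f^\theta_y(x) = \omega + ax + by$, $\theta = (\omega, a, b)$, is just one instance of the general setting.

```python
import math


def poisson_log_density(x, y):
    """ln h(x; y) for H(x; .) = Poisson(x), x > 0, w.r.t. the counting measure."""
    return y * math.log(x) - x - math.lgamma(y + 1)


def conditional_log_density(ys, x0, f, log_h):
    """Logarithm of (26): sum over k of ln h(f<y_{1:k-1}>(x0); y_k),
    with the convention f<y_{1:0}>(x0) = x0."""
    x, total = x0, 0.0
    for y in ys:
        total += log_h(x, y)  # x currently equals f<y_{1:k-1}>(x0)
        x = f(y, x)           # update to f<y_{1:k}>(x0)
    return total


def linear_map(omega, a, b):
    """Hypothetical parametric family f^theta_y(x) = omega + a x + b y."""
    return lambda y, x: omega + a * x + b * y


ys = [2, 0, 1, 3, 1]  # made-up observations
for theta in [(1.0, 0.3, 0.2), (0.5, 0.1, 0.6)]:
    value = conditional_log_density(ys, 1.5, linear_map(*theta), poisson_log_density)
    print(theta, value / len(ys))  # log-density normalised by n
```

Comparing the normalised value across candidate parameters, for a fixed starting point `x0`, is the basic operation behind a conditional maximum likelihood search.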

In this section, we study the asymptotic properties of $\hat{\theta}_{n,x}$, the conditional
Maximum Likelihood Estimator (MLE) of the parameter $\theta$ based on the
observations $(Y_1, \ldots, Y_n)$ and associated to the parametric family of likelihood
functions given in (26), that is, we consider

$$
\hat{\theta}_{n,x} \in \operatorname*{argmax}_{\theta \in \Theta} \mathsf{L}^\theta_{n,x}\langle Y_{1:n}\rangle, \tag{28}
$$

where

$$
\mathsf{L}^\theta_{n,x}\langle y_{1:n}\rangle := n^{-1} \ln\Big( \prod_{k=1}^{n} h\big(f^\theta\langle y_{1:k-1}\rangle(x);\ y_k\big) \Big). \tag{29}
$$

We are especially interested here in inference for misspecified models, that is,
we do not assume that the distribution of the observations belongs to the set
of distributions where the maximization occurs. In particular, $(Y_n)_{n \in \mathbb{Z}}$ are not
necessarily the observation process associated to the recursion (25).
Consider the following assumptions: 

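(B1) $\{Y_n\}_{n \in \mathbb{Z}}$ is a strict-sense stationary and ergodic stochastic process.

Under (B1), denote by $\mathbb{P}_\star$ the distribution of $\{Y_n\}_{n \in \mathbb{Z}}$ on $(\mathsf{Y}^{\mathbb{Z}}, \mathcal{Y}^{\otimes \mathbb{Z}})$. Write
$\mathbb{E}_\star$ the associated expectation.

(B2) For all $(x, y) \in \mathsf{X} \times \mathsf{Y}$, the functions $\theta \mapsto f^\theta_y(x)$ and $v \mapsto h(v; y)$ are
continuous.

(B3) There exists a family of $\mathbb{P}_\star$-a.s. finite random variables

$$
\{ f^\theta\langle Y_{-\infty:k}\rangle \ :\ (\theta, k) \in \Theta \times \mathbb{Z} \}
$$

such that for all $x \in \mathsf{X}$,

(i) $\lim_{m \to \infty} \sup_{\theta \in \Theta} d\big(f^\theta\langle Y_{-m:0}\rangle(x),\ f^\theta\langle Y_{-\infty:0}\rangle\big) = 0$, $\mathbb{P}_\star$-a.s.,

(ii) $\mathbb{P}_\star$-a.s.,

$$
\lim_{k \to \infty} \sup_{\theta \in \Theta} \big| \ln h\big(f^\theta\langle Y_{1:k-1}\rangle(x);\ Y_k\big) - \ln h\big(f^\theta\langle Y_{-\infty:k-1}\rangle;\ Y_k\big) \big| = 0,
$$

(iii) $\mathbb{E}_\star\big[ \sup_{\theta \in \Theta} \big( \ln h(f^\theta\langle Y_{-\infty:k-1}\rangle;\ Y_k) \big)_+ \big] < \infty$.

In the following, we set for all $(\theta, k) \in \Theta \times \mathbb{N}$,

$$
\ell^\theta\langle Y_{-\infty:k}\rangle := \ln h\big(f^\theta\langle Y_{-\infty:k-1}\rangle;\ Y_k\big). \tag{30}
$$

Remark 19. When checking (B3), we usually introduce $f^\theta\langle Y_{-\infty:0}\rangle$ by showing
that for all $(\theta, x) \in \Theta \times \mathsf{X}$, $f^\theta\langle Y_{-m:0}\rangle(x)$ converges, $\mathbb{P}_\star$-a.s., as $m$ goes to
infinity and that the limit does not depend on $x$. We can therefore denote by
$f^\theta\langle Y_{-\infty:0}\rangle$ this limit. With this definition, we then check (B3)-(i), (B3)-(ii) and (B3)-(iii).

Remark 20. When the observation process is integer-valued, the function $y \mapsto
h(x; y)$ is a probability and thus is less than one. It implies that for all $\theta \in \Theta$,

$$
\big(\ell^\theta\langle Y_{-\infty:0}\rangle\big)_+ = \big( \ln h(f^\theta\langle Y_{-\infty:-1}\rangle;\ Y_0) \big)_+ = 0.
$$

Thus, (B3)-(iii) is satisfied.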

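Note that under (B2), $\hat{\theta}_{n,x}$ is well-defined. The following theorem establishes
the consistency of the sequence of estimators $\{\hat{\theta}_{n,x}, n \in \mathbb{N}\}$.

Theorem 21. Assume (B1)-(B3). Then, for all $x \in \mathsf{X}$,

$$
\lim_{n \to \infty} d(\hat{\theta}_{n,x}, \Theta_\star) = 0, \quad \mathbb{P}_\star\text{-a.s.},
$$

where $\Theta_\star := \operatorname*{argmax}_{\theta \in \Theta} \mathbb{E}_\star\big(\ell^\theta\langle Y_{-\infty:0}\rangle\big)$.

Proof. The proof directly follows from Theorem 35 provided that

(a) $\mathbb{E}_\star\big[\sup_{\theta \in \Theta} \big(\ell^\theta\langle Y_{-\infty:0}\rangle\big)_+\big] < \infty$,

(b) $\mathbb{P}_\star$-a.s., the function $\theta \mapsto \ell^\theta\langle Y_{-\infty:0}\rangle$ is upper-semicontinuous,

(c) $\lim_{n \to \infty} \sup_{\theta \in \Theta} \big| \mathsf{L}^\theta_{n,x}\langle Y_{1:n}\rangle - \bar{\mathsf{L}}^\theta_{n}\langle Y_{-\infty:n}\rangle \big| = 0$, $\mathbb{P}_\star$-a.s., where

$$
\bar{\mathsf{L}}^\theta_{n}\langle Y_{-\infty:n}\rangle := n^{-1} \sum_{k=1}^{n} \ell^\theta\langle Y_{-\infty:k}\rangle.
$$

But (a) follows from (B3)-(iii), (b) follows by combining (B3)-(i) and (B2) since a uniform
limit of continuous functions is continuous, and (c) is direct from (B3)-(ii) and the
definitions of $\mathsf{L}^\theta_{n,x}\langle Y_{1:n}\rangle$ and $\bar{\mathsf{L}}^\theta_{n}\langle Y_{-\infty:n}\rangle$. The proof is completed. □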

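We end this section by providing a practical condition for checking the
assumption (B3)-(i) when $x \mapsto f^\theta_y(x)$ is Lipschitz.

Lemma 22. Assume that there exists a measurable function $\varrho : \mathsf{Y} \to \mathbb{R}_+$ such
that for all $(\theta, y, x, x') \in \Theta \times \mathsf{Y} \times \mathsf{X}^2$,

$$
d\big(f^\theta_y(x), f^\theta_y(x')\big) \le \varrho(y)\, d(x, x').
$$

Moreover, assume that for all $x \in \mathsf{X}$,

$$
\mathbb{E}_\star\Big[\sup_{\theta \in \Theta} \ln_+ d\big(x, f^\theta_{Y_0}(x)\big)\Big] < \infty, \quad \mathbb{E}_\star\big(\ln_+ \varrho(Y_0)\big) < \infty, \quad \text{and} \quad \mathbb{E}_\star\big(\ln \varrho(Y_0)\big) < 0.
$$

Then, assumption (B3)-(i) holds.

Proof. We have for all $m \ge 0$,

$$
d\big(f^\theta\langle Y_{-m:0}\rangle(x),\ f^\theta\langle Y_{-m:0}\rangle(y)\big) \le d(x, y) \prod_{\ell=0}^{m} \varrho(Y_{-\ell}). \tag{31}
$$

Taking $y = f^\theta_{Y_{-m-1}}(x)$, we obtain

$$
d\big(f^\theta\langle Y_{-m:0}\rangle(x),\ f^\theta\langle Y_{-m-1:0}\rangle(x)\big) \le d\big(x, f^\theta_{Y_{-m-1}}(x)\big) \prod_{\ell=0}^{m} \varrho(Y_{-\ell}).
$$

Now, since $\mathbb{E}_\star(\ln \varrho(Y_0)) < 0$, $\limsup_{m \to \infty} \big(\prod_{\ell=0}^{m} \varrho(Y_{-\ell})\big)^{1/m} < 1$ and Lemma 36
implies that

$$
\limsup_{m \to \infty} \Big( \sup_{\theta \in \Theta} d\big(x, f^\theta_{Y_{-m-1}}(x)\big) \Big)^{1/m} \le 1.
$$

By the Cauchy root test, the series $\sum_{m} \sup_{\theta \in \Theta} d\big(f^\theta\langle Y_{-m:0}\rangle(x),\ f^\theta\langle Y_{-m-1:0}\rangle(x)\big)$
is convergent. This implies that $\lim_{m \to \infty} f^\theta\langle Y_{-m:0}\rangle(x)$ exists, $\mathbb{P}_\star$-a.s., and does
not depend on $x$ by (31). This limit is denoted $f^\theta\langle Y_{-\infty:0}\rangle$. The convergence of
the series also implies

$$
\lim_{m \to \infty} \sup_{\theta \in \Theta} d\big(f^\theta\langle Y_{-m:0}\rangle(x),\ f^\theta\langle Y_{-\infty:0}\rangle\big) = 0, \quad \mathbb{P}_\star\text{-a.s.},
$$

so that (B3)-(i) holds. □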
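
The contraction argument behind (31) can be observed numerically. In the sketch below (names and parameter values are assumptions), the threshold map of Example 3 satisfies the Lipschitz bound with the constant $\varrho = a \vee (a + c)$ when $a \ge 0$ and $a + c \ge 0$, so two trajectories driven by the same observations but started from different points approach each other geometrically.

```python
import random


def f_threshold(y, x, omega, a, b, c, d, low, up):
    """f_y(x) of the Poisson threshold model."""
    outside = 0.0 if low < y < up else 1.0
    return omega + a * x + b * y + (c * x + d * y) * outside


def iterate(ys, x, params):
    """Apply f_{y_m} o ... o f_{y_1} to x for the observation path ys = (y_1, ..., y_m)."""
    for y in ys:
        x = f_threshold(y, x, **params)
    return x


# Hypothetical parameter values, chosen only for illustration.
params = dict(omega=1.0, a=0.3, b=0.2, c=0.4, d=0.1, low=2.5, up=7.5)
rho = max(params["a"], params["a"] + params["c"])  # Lipschitz constant, here 0.7
rng = random.Random(2)
ys = [rng.randrange(0, 10) for _ in range(40)]  # arbitrary observation path
x, x_prime = 0.5, 25.0
for m in (1, 5, 10, 20, 40):
    gap = abs(iterate(ys[:m], x, params) - iterate(ys[:m], x_prime, params))
    assert gap <= rho ** m * abs(x - x_prime) + 1e-9
    print(m, gap)
```

The bound holds for any observation path, which is why the limit of the iterated map no longer depends on the starting point.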
2.2. Well-specified models

In this section, we focus on well-specified models, that is, $\{Y_n\}_{n \in \mathbb{N}}$ are
assumed to be the observation process of a model defined by the recursions (25)
with $\theta = \theta_\star \in \Theta$. In well-specified models, we stress the dependence on $\theta_\star$ by
using the notations

$$
\mathbb{P}^{\theta_\star} := \mathbb{P}_\star, \quad \mathbb{E}^{\theta_\star} := \mathbb{E}_\star. \tag{32}
$$

According to Section 1, to obtain (B1), we only need to check that, for all $\theta \in \Theta$,
(A1)-(A3) hold with $f = f^\theta$. If in addition, we assume that (B2)-(B3) hold and that
$\Theta_\star = \{\theta_\star\}$, then Theorem 21 yields: for all $x \in \mathsf{X}$,

$$
\lim_{n \to \infty} \hat{\theta}_{n,x} = \theta_\star, \quad \mathbb{P}^{\theta_\star}\text{-a.s.}
$$

We now give conditions for having $\Theta_\star = \{\theta_\star\}$.

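Proposition 23. Let $\{(X_k,Y_k), k\in\mathbb{Z}\}$ be a stationary stochastic process indexed by $\mathbb{Z}$ which satisfies the recursions (25) for some $\theta = \theta_\star \in \Theta$ with $\mathcal{F}_k = \sigma(X_\ell, Y_\ell ; \ell \le k)$, $k\in\mathbb{Z}$. Assume that (B1-3) hold and that $X_0 = f^{\theta_\star}(Y_{-\infty:0})$; then, $H(X_0;\cdot)$ is the distribution of $Y_1$ conditionally on $\sigma(Y_\ell ; \ell \le 0)$. If in addition,

(a) $x \mapsto H(x;\cdot)$ is one-to-one, i.e., if $H(x;\cdot) = H(x';\cdot)$, then $x = x'$,
(b) $f^{\theta_\star}(Y_{-\infty:0}) = f^\theta(Y_{-\infty:0})$, $\mathbb{P}^{\theta_\star}$-a.s., implies that $\theta = \theta_\star$,
then $\Theta_\star = \{\theta_\star\}$.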

Remark 24. Condition (b) of Proposition 23 is similar to (Davis and Liu, 2012, Assumption (A5)). For the sake of clarity, we present here a self-contained proof that $\Theta_\star = \{\theta_\star\}$ under these conditions.

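Proof. For all $A \in \mathcal{Y}$,

$$\mathbb{E}^{\theta_\star}[\mathbb{1}_A(Y_1) \mid Y_{-\infty:0}] = \mathbb{E}^{\theta_\star}\bigl[\mathbb{E}^{\theta_\star}[\mathbb{1}_A(Y_1) \mid X_0, Y_{-\infty:0}] \mid Y_{-\infty:0}\bigr]$$
$$= \mathbb{E}^{\theta_\star}\bigl[\mathbb{E}^{\theta_\star}[\mathbb{1}_A(Y_1) \mid X_0] \mid Y_{-\infty:0}\bigr] = \mathbb{E}^{\theta_\star}[\mathbb{1}_A(Y_1) \mid X_0] = H(X_0; A) , \tag{33}$$

where we have used that $X_0$ is $\sigma(Y_\ell , \ell \le 0)$-measurable. This concludes the first part of Proposition 23. Now, for all $\theta\in\Theta$,

$$\mathbb{E}^{\theta_\star}\bigl[\ln h(f^{\theta_\star}(Y_{-\infty:0}); Y_1)\bigr] - \mathbb{E}^{\theta_\star}\bigl[\ln h(f^{\theta}(Y_{-\infty:0}); Y_1)\bigr] = \mathbb{E}^{\theta_\star}\left[\mathbb{E}^{\theta_\star}\left[\ln\frac{h(f^{\theta_\star}(Y_{-\infty:0}); Y_1)}{h(f^{\theta}(Y_{-\infty:0}); Y_1)} \,\middle|\, Y_{-\infty:0}\right]\right] . \tag{34}$$

Under the stated assumptions, $H(X_0;\cdot) = H(f^{\theta_\star}(Y_{-\infty:0});\cdot)$ and (33) shows that $H(f^{\theta_\star}(Y_{-\infty:0});\cdot) = \mathbb{P}^{\theta_\star}[Y_1\in\cdot \mid Y_{-\infty:0}]$. Therefore, the RHS of (34) is nonnegative as the expectation of a conditional Kullback-Leibler divergence. This shows that $\theta_\star \in \Theta_\star = \operatorname{argmax}_{\theta\in\Theta}\mathbb{E}^{\theta_\star}\bigl(\ln h(f^\theta(Y_{-\infty:0}); Y_1)\bigr)$. Assume now that $\theta\in\Theta_\star$. Then, according to (34), $\mathbb{P}^{\theta_\star}$-a.s., the probability measures $H(f^{\theta_\star}(Y_{-\infty:0});\cdot)$ and $H(f^{\theta}(Y_{-\infty:0});\cdot)$ are equal, so that under (a),

$$f^{\theta_\star}(Y_{-\infty:0}) = f^{\theta}(Y_{-\infty:0}) , \quad \mathbb{P}^{\theta_\star}\text{-a.s.} \tag{35}$$

Under (b), this implies that $\theta = \theta_\star$. □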
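The key step of the proof is that a Kullback-Leibler divergence is nonnegative and vanishes only when the two conditional laws coincide. The following sketch checks this in the Poisson case used in the examples; it is only an illustration, the intensities are hypothetical, and the truncation level `y_max` is an assumption chosen so that the neglected tail mass is negligible for these values.

```python
import math

def poisson_logpmf(lam, y):
    """Log-probability of y under a Poisson law with intensity lam."""
    return -lam + y * math.log(lam) - math.lgamma(y + 1)

def kl_poisson(lam_star, lam, y_max=200):
    """KL(Poisson(lam_star) || Poisson(lam)), summed over y = 0..y_max."""
    total = 0.0
    for y in range(y_max + 1):
        log_p = poisson_logpmf(lam_star, y)
        total += math.exp(log_p) * (log_p - poisson_logpmf(lam, y))
    return total

def kl_closed_form(lam_star, lam):
    """Closed form: lam_star * ln(lam_star / lam) - lam_star + lam."""
    return lam_star * math.log(lam_star / lam) - lam_star + lam

print(kl_poisson(2.0, 3.5), kl_closed_form(2.0, 3.5))
print(kl_poisson(2.0, 2.0))
```

The divergence is strictly positive as soon as the intensities differ, which is the one-to-one property (a) for the Poisson family.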

Examples 

2.3.1. The Poisson threshold in misspecified models 

Let $K$ be a compact set of $\mathbb{R}^5$ and let $\Theta$ be the following (compact) set of parameters

$$\Theta = \bigl\{\theta = (\omega,a,b,c,d)\in K : \min(\omega,a,b,a+c,b+d) \ge \underline{\alpha},\ a\vee(a+c) \le \bar\alpha < 1\bigr\} , \tag{36}$$

where $(\underline{\alpha},\bar\alpha) \in (0,\infty)\times(0,1)$. Assume that the observations $\{Y_n\}_{n\in\mathbb{Z}}$ are integer-valued and satisfy the following assumptions:

(C1) $\{Y_n\}_{n\in\mathbb{Z}}$ is a strict-sense stationary and ergodic stochastic process,

(C2) $\mathbb{E}_\star[\ln(1+Y_0)] < \infty$.



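The Poisson threshold autoregression model described in Example 3 may be rewritten as in (25), by setting $\mathsf{X} = [\underline{\alpha},\infty)$, $\mathsf{Y} = \mathbb{N}$ and

$$f^\theta_y(x) = \omega + ax + by + (cx + dy)\mathbb{1}\{y\notin(L,U)\} , \tag{37}$$
$$h(x;y) = \frac{\mathrm{d}H(x;\cdot)}{\mathrm{d}\mu}(y) = \exp(-x)\,x^y/y! , \tag{38}$$
$$\theta = (\omega,a,b,c,d) ,$$

where $\mu$ is the counting measure on $\mathbb{N}$. Note that $f^\theta_y(x) = \omega + a^\theta(y)x + b^\theta(y)$ where $a^\theta(y) = a + c\mathbb{1}\{y\notin(L,U)\}$ and $b^\theta(y) = by + dy\mathbb{1}\{y\notin(L,U)\}$, so that for all $(\theta,y)\in\Theta\times\mathsf{Y}$, $|f^\theta_y(x) - f^\theta_y(x')| \le \bar\alpha|x-x'|$. Moreover, using (27), we have for all $s\le t$,

$$f^\theta(y_{s:t})(x) = x\prod_{\ell=s}^{t}a^\theta(y_\ell) + \sum_{j=0}^{t-s}\bigl[\omega + b^\theta(y_{t-j})\bigr]\prod_{\ell=0}^{j-1}a^\theta(y_{t-\ell}) . \tag{39}$$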
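The closed form (39) can be checked against direct iteration of the update (37). The sketch below is illustrative: the parameter values, the thresholds `L` and `U`, and the observation window are hypothetical, and the indicator convention $\mathbb{1}\{y\notin(L,U)\}$ is assumed as read from (37).

```python
def indicator_outside(y, L, U):
    """1{y not in (L, U)}: 1 outside the open interval, 0 inside."""
    return 0.0 if L < y < U else 1.0

def a_theta(theta, y, L, U):
    _, a, _, c, _ = theta
    return a + c * indicator_outside(y, L, U)

def b_theta(theta, y, L, U):
    _, _, b, _, d = theta
    return b * y + d * y * indicator_outside(y, L, U)

def f_step(theta, y, x, L, U):
    """One update f_y(x) = omega + a(y) x + b(y), cf. (37)."""
    return theta[0] + a_theta(theta, y, L, U) * x + b_theta(theta, y, L, U)

def f_iterated(theta, ys, x, L, U):
    """f(y_{s:t})(x) by composing the updates, oldest observation first."""
    for y in ys:
        x = f_step(theta, y, x, L, U)
    return x

def f_closed_form(theta, ys, x, L, U):
    """Right-hand side of (39)."""
    omega = theta[0]
    prod_all = 1.0
    for y in ys:
        prod_all *= a_theta(theta, y, L, U)
    total = x * prod_all
    partial = 1.0  # prod_{l=0}^{j-1} a(y_{t-l}); empty product when j = 0
    for y in ys[::-1]:  # y_t, y_{t-1}, ..., y_s
        total += (omega + b_theta(theta, y, L, U)) * partial
        partial *= a_theta(theta, y, L, U)
    return total

theta = (1.0, 0.3, 0.2, 0.1, -0.1)   # hypothetical (omega, a, b, c, d)
ys = [0, 3, 8, 5, 1, 10]             # arbitrary window y_s, ..., y_t
print(f_iterated(theta, ys, 1.5, 2, 6), f_closed_form(theta, ys, 1.5, 2, 6))
```

Both evaluations agree up to floating-point rounding, since (39) is just the unrolled affine recursion.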

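With these definitions, let $\hat\theta_{n,x}$ be the Maximum Likelihood estimator associated to the likelihood function $L^\theta_{n,x}(Y_{1:n})$ as defined in (28).

Theorem 25. Assume (C1-2). Then, for all $x\in\mathsf{X}$, $\lim_{n\to\infty} d(\hat\theta_{n,x},\Theta_\star) = 0$, $\mathbb{P}_\star$-a.s., where

$$\Theta_\star := \operatorname{argmax}_{\theta\in\Theta}\mathbb{E}_\star\bigl(\ell^\theta(Y_{-\infty:0})\bigr) , \tag{40}$$

where $\ell^\theta(Y_{-\infty:0})$, defined in (30), can be written as:

$$\ell^\theta(Y_{-\infty:0}) = Y_0\ln\bigl(f^\theta(Y_{-\infty:-1})\bigr) - f^\theta(Y_{-\infty:-1}) - \ln Y_0! , \tag{41}$$
$$f^\theta(Y_{-\infty:n}) = \sum_{j=0}^{\infty}\bigl[\omega + b^\theta(Y_{n-j})\bigr]\prod_{\ell=0}^{j-1}a^\theta(Y_{n-\ell}) , \quad \forall (n,\theta)\in\mathbb{Z}\times\Theta . \tag{42}$$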

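Proof. According to Theorem 21, it is sufficient to check (B2-3). (B2) clearly holds. Assumption (C2) allows us to apply Lemma 22 so that (B3)-(i) holds. Using Remark 20 shows that Assumption (B3)-(iii) is satisfied. It remains to check (B3)-(ii). By (38), for all $(x,x')\in[\underline{\alpha},\infty)^2$ and $y\in\mathsf{Y}$,

$$|\ln h(x;y) - \ln h(x';y)| \le \Bigl(y\sup_{u\ge\underline{\alpha}}u^{-1} + 1\Bigr)|x-x'| \le (y/\underline{\alpha} + 1)\,|x-x'| .$$

Thus, since $f^\theta(Y_{-\infty:k-1}) = f^\theta(Y_{1:k-1})\bigl(f^\theta(Y_{-\infty:0})\bigr)$ and $x\mapsto f^\theta(Y_{1:k-1})(x)$ is affine with slope $\prod_{\ell=1}^{k-1}a^\theta(Y_\ell)$,

$$\sup_{\theta\in\Theta}\bigl|\ln h(f^\theta(Y_{1:k-1})(x); Y_k) - \ln h(f^\theta(Y_{-\infty:k-1}); Y_k)\bigr|$$
$$\le (Y_k/\underline{\alpha} + 1)\sup_{\theta\in\Theta}\bigl|f^\theta(Y_{1:k-1})(x) - f^\theta(Y_{-\infty:k-1})\bigr|$$
$$= (Y_k/\underline{\alpha} + 1)\sup_{\theta\in\Theta}\bigl|x - f^\theta(Y_{-\infty:0})\bigr|\prod_{\ell=1}^{k-1}a^\theta(Y_\ell)$$
$$\le (Y_k/\underline{\alpha} + 1)\Bigl(x + \sup_{\theta\in\Theta}f^\theta(Y_{-\infty:0})\Bigr)\bar\alpha^{k-1} ,$$

which converges to 0 as $k$ goes to infinity by applying Lemma 36 under (C2). □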

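2.4. The Poisson threshold in well-specified models

Let $K$ be a compact set of $\mathbb{R}^5$ and let $\Theta$ be the following (compact) set of parameters

$$\Theta = \bigl\{\theta = (\omega,a,b,c,d)\in K : \min(\omega,a,b,a+c,b+d) \ge \underline{\alpha},\ (a+b+c+d)\vee a \le \bar\alpha < 1\bigr\} , \tag{43}$$

where $(\underline{\alpha},\bar\alpha)\in(0,\infty)\times(0,1)$. We assume that $\{Y_n\}_{n\in\mathbb{Z}}$ is the observation process of the Poisson threshold model as described in Example 3 with $(\omega,a,b,c,d) = (\omega_\star,a_\star,b_\star,c_\star,d_\star) = \theta_\star$.

Proposition 26. Assume that $\theta_\star\in\Theta$ and that $(L,U)\cap\mathbb{N}\neq\emptyset$. Then, for all $x\in\mathsf{X}$, $\lim_{n\to\infty}\hat\theta_{n,x} = \theta_\star$, $\mathbb{P}^{\theta_\star}$-a.s.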

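Proof. Let $\{(X_k,Y_k), k\in\mathbb{Z}\}$ satisfy the recursions given by Example 3 with $(\omega,a,b,c,d) = (\omega_\star,a_\star,b_\star,c_\star,d_\star) = \theta_\star$. Proposition 14 shows that (C1-2) hold so that Theorem 25 applies. It thus remains to show that $\Theta_\star = \{\theta_\star\}$. This follows from Proposition 23 provided we show that

(a) $X_0 = f^{\theta_\star}(Y_{-\infty:0})$,

(b) $H(x;\cdot) = H(x';\cdot)$ implies that $x = x'$,

(c) $f^{\theta_\star}(Y_{-\infty:0}) = f^{\theta}(Y_{-\infty:0})$, $\mathbb{P}^{\theta_\star}$-a.s., implies that $\theta = \theta_\star$.

Define

$$a_\star(y) = a_\star + c_\star\mathbb{1}\{y\notin(L,U)\} , \qquad b_\star(y) = b_\star y + d_\star y\mathbb{1}\{y\notin(L,U)\} .$$

First note that, by (39), for all $m\ge 0$,

$$X_0 = f^{\theta_\star}(Y_{-m:0})(X_{-m}) = X_{-m}\prod_{\ell=0}^{m}a_\star(Y_{-\ell}) + \sum_{j=0}^{m}\bigl[\omega_\star + b_\star(Y_{-j})\bigr]\prod_{\ell=0}^{j-1}a_\star(Y_{-\ell}) .$$

By applying Lemma 36 and Proposition 14, we have

$$X_{-m}\prod_{\ell=0}^{m}a_\star(Y_{-\ell}) \le \bar\alpha^{m+1}X_{-m} \to_{m\to\infty} 0 , \quad \mathbb{P}^{\theta_\star}\text{-a.s.}$$

so that

$$X_0 = \lim_{m\to\infty}f^{\theta_\star}(Y_{-m:0})(X_{-m}) = f^{\theta_\star}(Y_{-\infty:0}) , \quad \mathbb{P}^{\theta_\star}\text{-a.s.} \tag{44}$$

where $f^{\theta_\star}(Y_{-\infty:0})$ is defined in (42). Thus, (a) holds. (b) also clearly holds since $H(x,\cdot)$ is a Poisson distribution of parameter $x$. It remains to check (c). If $\mathbb{P}^{\theta_\star}$-a.s., $f^{\theta_\star}(Y_{-\infty:0}) = f^{\theta}(Y_{-\infty:0})$, then, by stationarity of $\{Y_n\}_{n\in\mathbb{Z}}$, we have: for all $t\in\mathbb{Z}$, $X_t = X'_t$, $\mathbb{P}^{\theta_\star}$-a.s., where we set $X'_t := f^{\theta}(Y_{-\infty:t})$. This implies that $X_t = f^\theta_{Y_t}\circ f^\theta_{Y_{t-1}}(X'_{t-2}) = f^\theta_{Y_t}\circ f^\theta_{Y_{t-1}}(X_{t-2})$, $\mathbb{P}^{\theta_\star}$-a.s., so that,

$$\omega + a^\theta(Y_t)\bigl[\omega + a^\theta(Y_{t-1})X_{t-2} + b^\theta(Y_{t-1})\bigr] + b^\theta(Y_t)$$
$$= \omega_\star + a_\star(Y_t)\bigl[\omega_\star + a_\star(Y_{t-1})X_{t-2} + b_\star(Y_{t-1})\bigr] + b_\star(Y_t) , \quad \mathbb{P}^{\theta_\star}\text{-a.s.}$$

Since $\mathbb{P}^{\theta_\star}[(Y_t,Y_{t-1}) = (k,\ell) \mid Y_{-\infty:t-2}] \neq 0$ for all $(k,\ell)\in\mathbb{N}^2$ and $X_{t-2}$ is $\sigma(Y_i, i\le t-2)$-measurable, we obtain that, for all $(k,\ell)\in\mathbb{N}^2$, $\mathbb{P}^{\theta_\star}$-a.s.,

$$\omega + a^\theta(k)\bigl[\omega + a^\theta(\ell)X_{t-2} + b^\theta(\ell)\bigr] + b^\theta(k) = \omega_\star + a_\star(k)\bigl[\omega_\star + a_\star(\ell)X_{t-2} + b_\star(\ell)\bigr] + b_\star(k) .$$

Fix $\ell\in\mathbb{N}$. Then, recalling that $a^\theta(k)$ is bounded in $k$ and that $b^\theta(k)\sim_{k\to\infty}bk$, we obtain that $b = b_\star$. Fix now $k\in\mathbb{N}$ and take the equivalent of the previous equation as $\ell$ goes to infinity; we then obtain $a^\theta(k)b\ell = a_\star(k)b_\star\ell$ for all $k\in\mathbb{N}$, which can also be written as

$$a + c\mathbb{1}\{k\notin(L,U)\} = a_\star + c_\star\mathbb{1}\{k\notin(L,U)\} ,$$

so that $a = a_\star$ and $c = c_\star$ by using that $(L,U)\cap\mathbb{N}\neq\emptyset$. Finally, using $b = b_\star$, $a = a_\star$ and $c = c_\star$, we have $\mathbb{P}^{\theta_\star}$-a.s. for all $k\in\mathbb{N}$,

$$\omega + a_\star(k)X_{t-1} + b_\star k + dk\mathbb{1}\{k\notin(L,U)\} = \omega_\star + a_\star(k)X_{t-1} + b_\star k + d_\star k\mathbb{1}\{k\notin(L,U)\}$$

so that

$$\omega + dk\mathbb{1}\{k\notin(L,U)\} = \omega_\star + d_\star k\mathbb{1}\{k\notin(L,U)\} ,$$

which again implies that $\omega = \omega_\star$ and $d = d_\star$ since $(L,U)\cap\mathbb{N}\neq\emptyset$.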



□ 



2.5. The Log-linear Poisson autoregression in misspecified models

Let $\Theta$ be the following (compact) set of parameters

$$\Theta = \bigl\{\theta = (d,a,b)\in\mathbb{R}^3 : |d|\le\bar d,\ |a|\le\bar a < 1,\ |b|\le\bar b\bigr\} , \tag{45}$$

where $\bar d$, $\bar a$, $\bar b$ are positive constants. Assume that the observations $\{Y_n\}_{n\in\mathbb{Z}}$ are integer-valued and satisfy the assumptions (C1-2). The Log-linear Poisson autoregression model described in (17) may be rewritten as in (25), by setting $\mathsf{X} = \mathbb{R}$, $\mathsf{Y} = \mathbb{N}$, $\theta = (d,a,b)$, and

$$f^\theta_y(x) = d + ax + b\ln(1+y) , \tag{46}$$
$$h(x;y) = \frac{\mathrm{d}H(x;\cdot)}{\mathrm{d}\mu}(y) = \exp(-e^x)\,e^{xy}/y! , \tag{47}$$

where $\mu$ is the counting measure on $\mathbb{N}$. Using (27), we have for all $s\le t$,

$$f^\theta(y_{s:t})(x) = d\,\frac{1-a^{t-s+1}}{1-a} + a^{t-s+1}x + b\sum_{j=0}^{t-s}a^j\ln(1+y_{t-j}) . \tag{48}$$

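With these definitions, let $\hat\theta_{n,x}$ be the Maximum Likelihood estimator associated to the likelihood function $L^\theta_{n,x}(Y_{1:n})$ as defined in (28).

Theorem 27. Assume (C1-2). Then, for all $x\in\mathsf{X}$, $\lim_{n\to\infty}d(\hat\theta_{n,x},\Theta_\star) = 0$, $\mathbb{P}_\star$-a.s., where

$$\Theta_\star := \operatorname{argmax}_{\theta\in\Theta}\mathbb{E}_\star\Bigl(Y_0 f^\theta(Y_{-\infty:-1}) - e^{f^\theta(Y_{-\infty:-1})} - \ln Y_0!\Bigr) , \tag{49}$$
$$f^\theta(Y_{-\infty:n}) := \frac{d}{1-a} + b\sum_{j=0}^{\infty}a^j\ln(1+Y_{n-j}) , \quad \forall(n,\theta)\in\mathbb{Z}\times\Theta . \tag{50}$$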

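Proof. According to Theorem 21, it is sufficient to check (B2-3). (B2) clearly holds. Using Remark 20, since $\mathsf{Y} = \mathbb{N}$, we only need to check (B3)-(i) and (B3)-(ii). First note that

$$\sup_{\theta\in\Theta}|f^\theta(Y_{-\infty:0})| \le \frac{\bar d}{1-\bar a} + \bar b\sum_{j=0}^{\infty}\bar a^{j}\ln(1+Y_{-j}) < \infty , \quad \mathbb{P}_\star\text{-a.s.} \tag{51}$$

which is finite according to (C2) by using Lemma 36. Now, write for all $\theta = (d,a,b)\in\Theta$,

$$\bigl|f^\theta(Y_{-m:0})(x) - f^\theta(Y_{-\infty:0})\bigr| = \left|d\,\frac{1-a^{m+1}}{1-a} + a^{m+1}x + b\sum_{\ell=0}^{m}a^\ell\ln(1+Y_{-\ell}) - \frac{d}{1-a} - b\sum_{\ell=0}^{\infty}a^\ell\ln(1+Y_{-\ell})\right|$$
$$\le \bar a^{m+1}\left(\frac{\bar d}{1-\bar a} + |x| + \bar b\sum_{\ell=0}^{\infty}\bar a^{\ell}\ln(1+Y_{-m-1-\ell})\right) .$$

By (C2) and by applying Lemma 36, the right-hand side (which does not depend on $\theta$) converges to 0 as $m$ goes to infinity. Thus, (B3)-(i) holds. We now turn to (B3)-(ii). By (47),

$$\sup_{\theta\in\Theta}\bigl|\ln h(f^\theta(Y_{1:k-1})(x);Y_k) - \ell^\theta(Y_{-\infty:k})\bigr| \le Y_k\sup_{\theta\in\Theta}\bigl|f^\theta(Y_{1:k-1})(x) - f^\theta(Y_{-\infty:k-1})\bigr| + \sup_{\theta\in\Theta}\bigl|e^{f^\theta(Y_{1:k-1})(x)} - e^{f^\theta(Y_{-\infty:k-1})}\bigr| . \tag{52}$$

Consider the first term in the rhs. It follows immediately from (48) and (50) that

$$f^\theta(Y_{1:k-1})(x) - f^\theta(Y_{-\infty:k-1}) = a^{k-1}\bigl(x - f^\theta(Y_{-\infty:0})\bigr) . \tag{53}$$

This implies that, for all $k\ge 1$,

$$Y_k\sup_{\theta\in\Theta}\bigl|f^\theta(Y_{1:k-1})(x) - f^\theta(Y_{-\infty:k-1})\bigr| \le Y_k\,\bar a^{k-1}\Bigl(|x| + \sup_{\theta\in\Theta}|f^\theta(Y_{-\infty:0})|\Bigr) ,$$

which converges $\mathbb{P}_\star$-a.s. to 0 as $k$ goes to infinity according to (51) and by applying Lemma 36 under (C2). Moreover, (53) also implies that

$$\bigl|f^\theta(Y_{1:k-1})(x) - f^\theta(Y_{-\infty:k-1})\bigr| \le \bar a^{k-1}\bigl|x - f^\theta(Y_{-\infty:0})\bigr| \le \bigl|x - f^\theta(Y_{-\infty:0})\bigr| ,$$

so that, using $|e^u - e^v| \le |u-v|\,e^{|u-v|}\,e^{|v|}$, the second term of the rhs of (52) is bounded according to

$$\sup_{\theta\in\Theta}\bigl|e^{f^\theta(Y_{1:k-1})(x)} - e^{f^\theta(Y_{-\infty:k-1})}\bigr| \le \sup_{\theta\in\Theta}\Bigl(\bigl|x - f^\theta(Y_{-\infty:0})\bigr|\,e^{|x - f^\theta(Y_{-\infty:0})|}\Bigr)\times\bar a^{k-1}\sup_{\theta\in\Theta}e^{|f^\theta(Y_{-\infty:k-1})|} .$$

To complete the proof, it is thus sufficient to show that

$$\lim_{k\to\infty}\bar a^{k}\exp\Bigl\{\sup_{\theta\in\Theta}|f^\theta(Y_{-\infty:k-1})|\Bigr\} = 0 , \quad \mathbb{P}_\star\text{-a.s.}$$

But this is straightforward by applying Lemma 36 since by (51) and by setting $V_k := \exp\{\sup_{\theta\in\Theta}|f^\theta(Y_{-\infty:k-1})|\}$, we have

$$\mathbb{E}_\star[(\ln V_1)_+] \le \frac{\bar d}{1-\bar a} + \bar b\sum_{j=0}^{\infty}\bar a^{j}\,\mathbb{E}_\star[\ln(1+Y_{-j})] = \frac{\bar d}{1-\bar a} + \frac{\bar b\,\mathbb{E}_\star[\ln(1+Y_0)]}{1-\bar a} ,$$

which is finite by (C2). □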
2.6. Log-linear Poisson autoregression in well-specified models

Let $\Theta$ be the following (compact) set of parameters

$$\Theta = \bigl\{(d,a,b)\in\mathbb{R}^3 ;\ |d|\le\bar d,\ |a+b|\vee|a|\vee|b| \le \bar a < 1\bigr\} , \tag{54}$$

where $(\bar d,\bar a)\in\mathbb{R}_+\times(0,1)$. We assume that $\{Y_n\}_{n\in\mathbb{Z}}$ is the observation process of the Log-linear Poisson autoregression model described in (17) with parameters $(d,a,b) = (d_\star,a_\star,b_\star) = \theta_\star$.

Proposition 28. Assume that $\theta_\star\in\Theta$. Then, for all $x\in\mathsf{X}$,

$$\lim_{n\to\infty}\hat\theta_{n,x} = \theta_\star , \quad \mathbb{P}^{\theta_\star}\text{-a.s.}$$

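Proof. Let $\{(X_k,Y_k), k\in\mathbb{Z}\}$ satisfy (17) with $(d,a,b) = (d_\star,a_\star,b_\star) = \theta_\star$. Proposition 17 shows that (C1-2) hold so that Theorem 27 applies. It thus remains to show that $\Theta_\star = \{\theta_\star\}$. This follows from Proposition 23 provided we show that

(a) $X_0 = f^{\theta_\star}(Y_{-\infty:0})$, $\mathbb{P}^{\theta_\star}$-a.s.,

(b) $H(x;\cdot) = H(x';\cdot)$ implies that $x = x'$,

(c) $f^{\theta_\star}(Y_{-\infty:0}) = f^{\theta}(Y_{-\infty:0})$, $\mathbb{P}^{\theta_\star}$-a.s., implies that $\theta = \theta_\star$.

First note that, by (48), for all $m\ge 0$,

$$X_0 = d_\star\frac{1-a_\star^{m+1}}{1-a_\star} + a_\star^{m+1}X_{-m} + b_\star\sum_{j=0}^{m}a_\star^{j}\ln(1+Y_{-j}) .$$

By applying Lemma 36 and Proposition 17, we have $\lim_{m\to\infty}a_\star^{m+1}X_{-m} = 0$, $\mathbb{P}^{\theta_\star}$-a.s., so that

$$X_0 = \lim_{m\to\infty}f^{\theta_\star}(Y_{-m:0})(X_{-m}) = f^{\theta_\star}(Y_{-\infty:0}) , \quad \mathbb{P}^{\theta_\star}\text{-a.s.} \tag{55}$$

where $f^{\theta_\star}(Y_{-\infty:0})$ is defined in (50). Thus, (a) holds. (b) also clearly holds since $H(x,\cdot)$ is a Poisson distribution of parameter $e^x$. It remains to check (c). If $\mathbb{P}^{\theta_\star}$-a.s., $f^{\theta_\star}(Y_{-\infty:0}) = f^{\theta}(Y_{-\infty:0})$, then, by definition of $f^\theta(Y_{-\infty:0})$,

$$\frac{d_\star}{1-a_\star} - \frac{d}{1-a} + \sum_{j=0}^{\infty}\bigl(b_\star a_\star^{j} - ba^{j}\bigr)\ln(1+Y_{-j}) = 0 , \quad \mathbb{P}^{\theta_\star}\text{-a.s.}$$

Conditionally on $\sigma(Y_m ; m\le -1)$, $Y_0$ is a Poisson random variable with a positive intensity; thus, the lhs is constant only if $b_\star = b$. This implies that

$$\frac{d_\star}{1-a_\star} - \frac{d}{1-a} + b\sum_{j=1}^{\infty}\bigl(a_\star^{j} - a^{j}\bigr)\ln(1+Y_{-j}) = 0 , \quad \mathbb{P}^{\theta_\star}\text{-a.s.}$$

By the same argument, the lhs is constant conditionally on $\sigma(Y_m ; m\le -2)$ only if $a_\star = a$. In that case, the previous equality writes: $d_\star - d = 0$, which completes the proof. □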
3. Proofs of Theorem 8, Lemma 9 and Lemma 11

The proof roughly follows the lines of Henderson et al. (2011) with the difference that we relax the Lipschitz assumption and introduce a drift function $W$. In all this section, $(\mathsf{X},d)$ is a Polish (complete, separable and metric) space and we denote by $\mathcal{X}$ its associated Borel $\sigma$-field. A totally separating system of metrics $\{d_n , n\in\mathbb{N}\}$ for $\mathsf{X}$ is a set of metrics such that for all fixed $x,x'\in\mathsf{X}$, the sequence $\{d_n(x,x') , n\in\mathbb{N}\}$ is nondecreasing in $n$ and $\lim_{n\to\infty}d_n(x,x') = \mathbb{1}_{x\neq x'}$. A metric $d$ on $\mathsf{X}$ induces a Wasserstein distance between probability measures $\mu_1$ and $\mu_2$ on $(\mathsf{X},\mathcal{X})$ defined by:

$$\|\mu_1 - \mu_2\|_d = \inf\left\{\int\mu(\mathrm{d}x,\mathrm{d}x')\,d(x,x') : \mu\in\mathcal{M}(\mu_1,\mu_2)\right\} , \tag{56}$$

where $\mathcal{M}(\mu_1,\mu_2)$ is the set of probability measures $\mu$ on $(\mathsf{X}^2,\mathcal{X}^{\otimes 2})$ such that

$$\mu(A\times\mathsf{X}) = \mu_1(A) , \quad \mu(\mathsf{X}\times A) = \mu_2(A) , \quad\text{for all } A\in\mathcal{X} .$$

Since $\mathsf{X}$ is a separable metric space, the Kantorovich-Rubinstein duality theorem applies (see for example (Dudley, 2002, Theorem 11.8.2)) and we have

$$\|\mu_1 - \mu_2\|_d = \sup\bigl\{\mu_1(f) - \mu_2(f) : \operatorname{Lip}(f;d)\le 1\bigr\} , \tag{57}$$

where

$$\operatorname{Lip}(f;d) = \sup\left\{\frac{|f(x) - f(x')|}{d(x,x')} : x,x'\in\mathsf{X},\ x\neq x'\right\} .$$

Recall the definition of an asymptotically strong Feller kernel, first introduced by Hairer and Mattingly (2006):

Definition 29. A Markov kernel $Q$ is asymptotically strong Feller if, for all $x\in\mathsf{X}$, there exist a totally separating system of metrics $\{d_n, n\in\mathbb{N}\}$ for $\mathsf{X}$ and a sequence of integers $\{t_n, n\in\mathbb{N}\}$ such that

$$\lim_{\gamma\to 0}\limsup_{n\to\infty}\sup_{x'\in B(x,\gamma)}\|Q^{t_n}(x,\cdot) - Q^{t_n}(x',\cdot)\|_{d_n} = 0 ,$$

where $B(x,\gamma)$ is the open ball of radius $\gamma$ with respect to $d$ and centered at $x$.
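A concrete totally separating system, and the one used in the proof of Theorem 8 in this section, is $d_n(x,x') = 1\wedge[n\,d(x,x')]$. The sketch below illustrates its two defining properties on the real line with $d(x,x') = |x-x'|$; the choice of space and of the sample points is an assumption made only for illustration.

```python
def d_n(n, x, x_prime):
    """Truncated metric d_n(x, x') = min(1, n * |x - x'|) on the real line."""
    return min(1.0, n * abs(x - x_prime))

x, x_prime = 0.30, 0.31
values = [d_n(n, x, x_prime) for n in range(1, 201)]

# Nondecreasing in n, and eventually equal to the indicator 1{x != x'}.
print(values[0], values[49], values[-1])
print(d_n(10**6, x, x))
```

For two distinct points the sequence saturates at 1 once $n \ge 1/d(x,x')$, while it stays at 0 when the points coincide.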

The following theorem is taken from Hairer and Mattingly (2006) and provides conditions for obtaining uniqueness of the invariant probability measure.

Theorem 30. Assume that the Markov kernel $Q$ is asymptotically strong Feller and admits a reachable point $x\in\mathsf{X}$. Then, $Q$ has at most one stationary distribution.

Proof of Theorem 8. Under (A1), (Tweedie, 1988, Theorem 2) shows that $Q$ admits at least one stationary distribution. Since by (A2) $Q$ admits a reachable point, we conclude by applying Theorem 30 provided that we can prove $Q$ is asymptotically strong Feller. Denote

$$\tau = \inf\{i\in\mathbb{N} : U_i = 0\} \tag{58}$$

with the convention $\inf\emptyset = \infty$. We preface the proof by the following technical lemma:
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Lemma 31. Let {{Xk, X'^.,Uk) ,k € N} be a Markov chain on (X? x {0,1}) 
with Markov kernel Q, introduced in (-A[3]). Then, for all real-valued nonnegative 
measurable function (p on , n G N* , x, x' ^ X and u G [0, 1], 



n-1 



i=0 



(59) 



where and a are introduced in (-A[3]), Xf := {Xi, X^) and B{u) is the Bernoulli 
distribution with parameter u. 



Proof. The proof is by induction. Note first that (l59l) obviously holds for n = 1. 
Now, assume that ([5^ holds for some n > 1. Then, noting that \{T>n+i} = 
]Ti=i Ui (where T is defined in ([Sg])) 



ViXl+i)l{T>n+l} 



n+1 



S^<S>S^,<»B(u) 



E 



S^®S^,®B(u) 



¥P{Un+MXi+^)\Xl Un) n U^ 
n 

a{Xl) Q^^{Xi)\{U. 



where the last equality follows from ([S]). Applying the induction assumption to 
the right-hand side of the inequality, we obtain 



■ S^<»S^,®B{u) 



>"+!} 



n-1 



i=0 



i=0 



a{Xf,)EQ\p{xl^,)\Xl)Y[a{xf) 

n 

p(xi^^)X{a{X\ 



The proof is completed. 



□ 



Now, consider $d_n(x,x') = 1\wedge[n\,d(x,x')]$. Obviously, for all fixed $x,x'\in\mathsf{X}$, the sequence $\{d_n(x,x'), n\in\mathbb{N}\}$ is nondecreasing and $\lim_{n\to\infty}d_n(x,x') = \mathbb{1}\{x\neq x'\}$ so that $\{d_n , n\in\mathbb{N}\}$ is a totally separating system of metrics. Moreover, the Kantorovich-Rubinstein duality theorem (56), (57) and the marginal conditions yield: for all $(x,x',u)\in\mathsf{X}\times B(x,\gamma_x)\times[0,1]$,

$$\|\delta_xQ^n - \delta_{x'}Q^n\|_{d_n} \le \mathbb{E}_{\delta_x\otimes\delta_{x'}\otimes B(u)}\bigl(d_n(X_n,X'_n)\bigr)$$
$$\le \mathbb{P}_{\delta_x\otimes\delta_{x'}\otimes B(u)}(\tau\le n) + \mathbb{E}_{\delta_x\otimes\delta_{x'}\otimes B(u)}\bigl(d_n(X_n,X'_n)\mathbb{1}\{\tau > n\}\bigr) , \tag{60}$$

where we have used that $d_n(x,y)\le 1$. First consider the second term of the right-hand side. Applying Lemma 31 combined with $d_n(x,y)\le n\,d(x,y)$, $\alpha(x,x')\le 1$
for all {x,x') £ and ©, yields 



n-1 



i=0 



:(„,K(X„,X;)1{T > n}) < nEl^,^^ 

<nDp"d{x,x') . (61) 
We now turn to the first term of the right-hand side of (|60p . By ([7]) , we get 

n-l 

ji-i 

<llK^K>®Biu)\nT > k}d{X^,X',)W{Xu,Xl)] 

k=0 



Applying ILemma 3l1 to the right-hand side , combined with a < 1 and ^ , we 
obtain 

k-l 

d{Xk,X',)W{X,,X'^)Y[a{X,,Xl) 

i=Q 

r.,r r l.r.rC , Dd^Ux , x')W^^ (x , x') 

< Dd^^ ix, x')W'^^ ix, x') y < ^-^—^ — — - . 

fe=0 ^ 

Plugging this and (IFTI) into (l60l) yields: for all x £ X and all x' e B(a::,7) where 
7 < Ix, 

||<5.g"-<5,-g"|U„ + sup W^-ix,y)/il-p) 

where jx is defined in (ITUl) . Thus, for all a; G X, 

$$\lim_{\gamma\to 0}\limsup_{n\to\infty}\sup_{x'\in B(x,\gamma)}\|Q^n(x,\cdot) - Q^n(x',\cdot)\|_{d_n} = 0 .$$

The proof is completed. □ 

Proof of lLemma !A Since for all M > 0, the function x ^-^ x A M is concave, we 
have for all $n\in\mathbb{N}$,

$$Q^n(V\wedge M) \le (Q^nV)\wedge M \le [\lambda^nV + \beta/(1-\lambda)]\wedge M .$$

By integrating with respect to $\pi$, we obtain that

$$\pi(V\wedge M) = \pi Q^n(V\wedge M) \le \pi\bigl\{[\lambda^nV + \beta/(1-\lambda)]\wedge M\bigr\} .$$

The Lebesgue convergence theorem yields, by letting $n$ go to infinity,

$$\pi(V\wedge M) \le \beta/(1-\lambda)\wedge M .$$

The proof follows by letting $M$ go to infinity. □
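The iterated drift bound $Q^nV \le \lambda^nV + \beta/(1-\lambda)$ used above can be made explicit on a toy example. The sketch below is a hypothetical illustration, not a model from the paper: it takes the Gaussian AR(1) chain $X' = \varphi X + \sigma\varepsilon$ with $V(x) = x^2$, for which $QV = \varphi^2V + \sigma^2$, so that $\lambda = \varphi^2$ and $\beta = \sigma^2$, and the stationary value $\pi(V) = \sigma^2/(1-\varphi^2)$ attains the bound $\beta/(1-\lambda)$.

```python
def qn_V(x, n, phi, sigma):
    """Q^n V(x) = E[X_n^2 | X_0 = x] for X' = phi*X + sigma*eps, V(x) = x^2."""
    v = x * x
    for _ in range(n):
        v = phi * phi * v + sigma * sigma   # one application of Q to V
    return v

phi, sigma = 0.8, 1.0
lam, beta = phi * phi, sigma * sigma
x0 = 3.0

for n in (0, 1, 5, 20, 200):
    bound = lam ** n * x0 * x0 + beta / (1.0 - lam)
    print(n, qn_V(x0, n, phi, sigma), bound)
```

As $n$ grows, `qn_V` converges to $\beta/(1-\lambda)$ regardless of the starting point, which mirrors the limiting argument in the proof.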



Appendix A. Ergodicity of one-sided and two-sided sequences 

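Let $(\mathsf{X},\mathcal{X})$ be a measurable space. Denote by $S : \mathsf{X}^{\mathbb{N}}\to\mathsf{X}^{\mathbb{N}}$ and $\tilde S : \mathsf{X}^{\mathbb{Z}}\to\mathsf{X}^{\mathbb{Z}}$ the shift operators defined by: for all $x = (x_t)_{t\in\mathbb{N}}\in\mathsf{X}^{\mathbb{N}}$ and all $\tilde x = (\tilde x_t)_{t\in\mathbb{Z}}\in\mathsf{X}^{\mathbb{Z}}$,

$$S(x) = (y_t)_{t\in\mathbb{N}} , \quad\text{where } y_t = x_{t+1} ,\ t\in\mathbb{N} , \tag{A.1}$$
$$\tilde S(\tilde x) = (\tilde y_t)_{t\in\mathbb{Z}} , \quad\text{where } \tilde y_t = \tilde x_{t+1} ,\ t\in\mathbb{Z} . \tag{A.2}$$

Note that $\tilde S$ is invertible while $S$ is not. Let $P$ be a Markov kernel on $(\mathsf{X},\mathcal{X})$. Denote by $\mathbb{P}_\mu$ the probability induced on $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}})$ by a Markov chain of initial distribution $\mu$ and Markov kernel $P$, and write $\mathbb{E}_\mu$ for the associated expectation operator. If $\mu = \pi$ is an invariant distribution for $P$, we can define a probability $\tilde{\mathbb{P}}_\pi$ induced on $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}})$ by the Markov kernel $P$ and initial distribution $\pi$. Similarly, we write $\tilde{\mathbb{E}}_\pi$ for the associated expectation operator. Moreover, $\tilde{\mathbb{P}}_\pi$ extends $\mathbb{P}_\pi$ on $\mathbb{Z}$ in the sense that for all $A\in\mathcal{X}^{\otimes\mathbb{N}}$, $\mathbb{P}_\pi(A) = \tilde{\mathbb{P}}_\pi(\mathsf{X}^{\mathbb{Z}_-^*}\times A)$, which can also be written as

$$\mathbb{P}_\pi = \tilde{\mathbb{P}}_\pi\circ p^{-1} , \tag{A.3}$$

where $p$ is the mapping from $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}})$ to $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}})$ defined by

$$p(\omega) = (\omega_n)_{n\in\mathbb{N}} \quad\text{where } \omega = (\omega_n)_{n\in\mathbb{Z}} . \tag{A.4}$$

Define for all $k\in\mathbb{N}$, $X_k : \mathsf{X}^{\mathbb{N}}\to\mathsf{X}$ by

$$X_k(\omega) = \omega_k , \quad\text{where } \omega = (\omega_\ell)_{\ell\in\mathbb{N}}\in\mathsf{X}^{\mathbb{N}} ,$$

and similarly, define for all $k\in\mathbb{Z}$, $\tilde X_k : \mathsf{X}^{\mathbb{Z}}\to\mathsf{X}$ by

$$\tilde X_k(\omega) = \omega_k , \quad\text{where } \omega = (\omega_\ell)_{\ell\in\mathbb{Z}}\in\mathsf{X}^{\mathbb{Z}} . \tag{A.5}$$

Recall that $(\Omega,\mathcal{F},\mathbb{P},\tau)$ is a measure-preserving dynamical system if $(\Omega,\mathcal{F},\mathbb{P})$ is a probability space and $\tau : \Omega\to\Omega$ is measurable such that $\mathbb{P}\circ\tau^{-1} = \mathbb{P}$. Moreover, a measure-preserving dynamical system $(\Omega,\mathcal{F},\mathbb{P},\tau)$ is said to be ergodic if for all invariant subsets $A\in\mathcal{F}$, i.e. $\mathbb{1}_A = \mathbb{1}_A\circ\tau$, we have $\mathbb{P}(A) = 0$ or 1. Recall that if $\mathbb{1}_B = \mathbb{1}_B\circ\tau$, $\mathbb{P}$-a.s., then there exists an invariant set $A$ such that $\mathbb{1}_A = \mathbb{1}_B$, $\mathbb{P}$-a.s. In the following, $\tau^k : \Omega\to\Omega$ is the mapping $\tau$ iterated $k$ times, that is $\tau^k = \tau\circ\dots\circ\tau$, and by convention $\tau^0(\omega) = \omega$ for all $\omega\in\Omega$.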

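Theorem 32. Assume that the Markov kernel $P$ has a unique stationary distribution $\pi$. Then, the dynamical system $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P}_\pi,S)$ is ergodic.

Proof. Let $A\in\mathcal{X}^{\otimes\mathbb{N}}$ be an invariant set for $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P}_\pi,S)$, that is: $\mathbb{1}_A = \mathbb{1}_A\circ S$. We will show that $\mathbb{P}_\pi(A) = 0$ or 1 by contradiction. Assume indeed that $\mathbb{P}_\pi(A)\in(0,1)$. Using the Markov property and the fact that $A$ is invariant,

$$\mathbb{E}_{X_k}(\mathbb{1}_A) = \mathbb{E}_\pi[\mathbb{1}_A\circ S^k \mid \mathcal{F}_k] = \mathbb{E}_\pi[\mathbb{1}_A \mid \mathcal{F}_k] , \quad \mathbb{P}_\pi\text{-a.s.}$$

where $\mathcal{F}_k = \sigma(X_0,\dots,X_k)$. Therefore, $\{(\mathbb{E}_{X_k}(\mathbb{1}_A),\mathcal{F}_k), k\in\mathbb{N}\}$ is a uniformly integrable martingale. By (Hall and Heyde, 1980, Corollary 2.2), $\lim_{k\to\infty}\mathbb{E}_{X_k}(\mathbb{1}_A) = \mathbb{1}_A$, $\mathbb{P}_\pi$-a.s. and $\lim_{k\to\infty}\mathbb{E}_\pi|\mathbb{E}_{X_k}(\mathbb{1}_A) - \mathbb{1}_A| = 0$. Then,

$$\mathbb{E}_\pi(|\mathbb{1}_A - \mathbb{E}_{X_0}(\mathbb{1}_A)|) = \mathbb{E}_\pi(|\mathbb{1}_A - \mathbb{E}_{X_0}(\mathbb{1}_A)|\circ S^k)$$
$$= \mathbb{E}_\pi(|\mathbb{1}_A - \mathbb{E}_{X_k}(\mathbb{1}_A)|) = \lim_{k\to\infty}\mathbb{E}_\pi(|\mathbb{1}_A - \mathbb{E}_{X_k}(\mathbb{1}_A)|) = 0 ,$$

so that $\mathbb{1}_A = \mathbb{P}_{X_0}(A)$, $\mathbb{P}_\pi$-a.s. Setting

$$\Gamma_A := \{x\in\mathsf{X} ,\ \mathbb{P}_x(A) = 1\} , \tag{A.6}$$

we then obtain $\mathbb{1}_A = \mathbb{1}_{\Gamma_A}(X_0)$, $\mathbb{P}_\pi$-a.s. Combining it with the fact that $A$ is invariant, we get for all $k\in\mathbb{N}$,

$$\mathbb{1}_A = \mathbb{1}_A\circ S^k = \mathbb{1}_{\Gamma_A}(X_0\circ S^k) = \mathbb{1}_{\Gamma_A}(X_k) , \quad \mathbb{P}_\pi\text{-a.s.} \tag{A.7}$$

Now, let $\pi_A(\cdot) = \alpha^{-1}\pi(\Gamma_A\cap\cdot)$ where $\alpha = \mathbb{P}_\pi(A)\neq 0$. By definition of $\pi_A$ and by using (A.7) with $k = 0$ and $k = 1$, we get for all $B\in\mathcal{X}$,

$$\mathbb{P}_{\pi_A}(X_1\in B) = \alpha^{-1}\mathbb{P}_\pi(\{X_1\in B\}\cap\{X_0\in\Gamma_A\})$$
$$= \alpha^{-1}\mathbb{P}_\pi(\{X_1\in B\}\cap\{X_1\in\Gamma_A\})$$
$$= \alpha^{-1}\mathbb{P}_\pi(X_1\in B\cap\Gamma_A) = \alpha^{-1}\pi(B\cap\Gamma_A) = \pi_A(B) ,$$

showing that $\pi_A$ is a stationary distribution for the Markov kernel $P$. Since $A$ is an invariant set, $A^c$ is also an invariant set and thus, $\pi_{A^c}$ is also a stationary distribution for the Markov kernel $P$. Since by assumption there exists a unique stationary distribution, we have that $\pi_A = \pi_{A^c}$, which is not possible since these probability measures have disjoint supports (indeed by (A.6), we have $\Gamma_A\cap\Gamma_{A^c} = \emptyset$). □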

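Theorem 33. Assume that the dynamical system $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P}_\pi,S)$ is ergodic. Then, the dynamical system $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}},\tilde{\mathbb{P}}_\pi,\tilde S)$ is ergodic, where $\tilde{\mathbb{P}}_\pi$ and $\tilde S$ denote the two-sided extension of $\mathbb{P}_\pi$ and the two-sided shift.

Proof. Let $A$ be an invariant set for the dynamical system $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}},\tilde{\mathbb{P}}_\pi,\tilde S)$, that is $\mathbb{1}_A = \mathbb{1}_A\circ\tilde S$. We now show that $\tilde{\mathbb{P}}_\pi(A) = 0$ or 1.

Note first that $\mathcal{X}^{\otimes\mathbb{Z}} = \sigma(\tilde{\mathcal{F}}_{-k} , k\in\mathbb{N})$ where $\tilde{\mathcal{F}}_i = \sigma(\tilde X_\ell , i\le\ell<\infty)$ and $\tilde X_\ell$ is the coordinate map defined in (A.5). This allows us to apply the approximation Lemma (see for example (Gray, 2009, Corollary 1.5.3)) showing that for all $\varepsilon > 0$, there exist $k_\varepsilon\in\mathbb{N}$ and a $\tilde{\mathcal{F}}_{-k_\varepsilon}$-measurable random variable $Z_\varepsilon$ such that $\tilde{\mathbb{E}}_\pi(|Z_\varepsilon|) < \infty$ and $\tilde{\mathbb{E}}_\pi|\mathbb{1}_A - Z_\varepsilon| \le \varepsilon$. Then, setting $Y_\varepsilon = Z_\varepsilon\circ\tilde S^{k_\varepsilon}\in\tilde{\mathcal{F}}_0$ and using that $A$ is an invariant set, we obtain

$$\tilde{\mathbb{E}}_\pi|\mathbb{1}_A - Y_\varepsilon| = \tilde{\mathbb{E}}_\pi|\mathbb{1}_A\circ\tilde S^{k_\varepsilon} - Z_\varepsilon\circ\tilde S^{k_\varepsilon}| = \tilde{\mathbb{E}}_\pi|\mathbb{1}_A - Z_\varepsilon| \le \varepsilon .$$

The positive real number $\varepsilon$ being arbitrary, there exists $Y\in\tilde{\mathcal{F}}_0$ such that $\tilde{\mathbb{E}}_\pi|Y| < \infty$ and $\mathbb{1}_A = Y$, $\tilde{\mathbb{P}}_\pi$-a.s., which implies that $1 = \tilde{\mathbb{P}}_\pi(\mathbb{1}_A = Y) \le \tilde{\mathbb{P}}_\pi(Y\in\{0,1\}) \le 1$. Thus, there exists $B\in\tilde{\mathcal{F}}_0$ such that

$$\mathbb{1}_B = Y = \mathbb{1}_A , \quad \tilde{\mathbb{P}}_\pi\text{-a.s.} \tag{A.8}$$

Eq. (A.8) and the invariance of $A$ then show that

$$\mathbb{1}_B = \mathbb{1}_B\circ\tilde S , \quad \tilde{\mathbb{P}}_\pi\text{-a.s.}$$

Now, note that $\tilde{\mathcal{F}}_0 = \sigma(p)$ where $p$ is defined in (A.4). Then, since $B\in\tilde{\mathcal{F}}_0$, there exists $C\in\mathcal{X}^{\otimes\mathbb{N}}$ such that $B = p^{-1}(C)$ and thus,

$$1 = \tilde{\mathbb{P}}_\pi(\mathbb{1}_B = \mathbb{1}_B\circ\tilde S) = \tilde{\mathbb{P}}_\pi\bigl(\mathbb{1}\{p(\cdot)\in C\} = \mathbb{1}\{p\circ\tilde S(\cdot)\in C\}\bigr)$$
$$= \tilde{\mathbb{P}}_\pi(\mathbb{1}_C\circ p = \mathbb{1}_C\circ p\circ\tilde S)$$
$$\overset{(1)}{=} \tilde{\mathbb{P}}_\pi(\mathbb{1}_C\circ p = \mathbb{1}_C\circ S\circ p) = \tilde{\mathbb{P}}_\pi\circ p^{-1}(\mathbb{1}_C = \mathbb{1}_C\circ S) \overset{(2)}{=} \mathbb{P}_\pi(\mathbb{1}_C = \mathbb{1}_C\circ S) ,$$

where (1) follows from $p\circ\tilde S = S\circ p$ and (2) from $\mathbb{P}_\pi = \tilde{\mathbb{P}}_\pi\circ p^{-1}$ (see (A.3)). The dynamical system $(\mathsf{X}^{\mathbb{N}},\mathcal{X}^{\otimes\mathbb{N}},\mathbb{P}_\pi,S)$ being ergodic, it implies that $\mathbb{P}_\pi(C) = 0$ or 1, which concludes the proof since

$$\mathbb{P}_\pi(C) = \tilde{\mathbb{P}}_\pi\circ p^{-1}(C) = \tilde{\mathbb{P}}_\pi(B) = \tilde{\mathbb{P}}_\pi(A) .$$

□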

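Proposition 34. Let $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}},\mathbb{P},S)$ be a measure-preserving dynamical system, where $S$ denotes the shift on $\mathsf{X}^{\mathbb{Z}}$. Then, the following statements are equivalent:

(a) $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}},\mathbb{P},S)$ is ergodic.

(b) for all measurable functions $h : \mathsf{X}^{\mathbb{Z}}\to\mathbb{R}$ satisfying $\mathbb{E}(h_+) < \infty$,

$$n^{-1}\sum_{k=0}^{n-1}h\circ S^k \to_{n\to\infty} \mathbb{E}(h) , \quad \mathbb{P}\text{-a.s.} \tag{A.9}$$

Proof. We first show that (a) implies (b). Assume that $\mathbb{E}(h_+) < \infty$. If $\mathbb{E}(h_-) < \infty$, then (A.9) follows from Birkhoff's ergodic theorem. If $\mathbb{E}(h_-) = \infty$, then $\mathbb{E}(h) = -\infty$. Moreover, since for all nonnegative real numbers $M$,

$$-M \le h\mathbb{1}\{h\ge -M\} \le h_+ ,$$

the monotone convergence theorem applied to the nondecreasing and nonnegative function $M\mapsto h_+ - h\mathbb{1}\{h\ge -M\}$ yields

$$\lim_{M\to\infty}\mathbb{E}(h\mathbb{1}\{h\ge -M\}) = \mathbb{E}\bigl(\lim_{M\to\infty}h\mathbb{1}\{h\ge -M\}\bigr) = \mathbb{E}(h) = -\infty ,$$

so that $\mathbb{E}(h\mathbb{1}\{h\ge -M\}) \to_{M\to\infty} -\infty$. The proof follows from

$$\limsup_{n\to\infty}n^{-1}\sum_{k=0}^{n-1}h\circ S^k \le \limsup_{n\to\infty}n^{-1}\sum_{k=0}^{n-1}(h\circ S^k)\mathbb{1}(h\circ S^k\ge -M) = \mathbb{E}(h\mathbb{1}\{h\ge -M\}) , \quad \mathbb{P}\text{-a.s.}$$

by letting $M$ go to infinity. Conversely, assume (b). Let $A\in\mathcal{X}^{\otimes\mathbb{Z}}$ be such that $\mathbb{1}_A\circ S = \mathbb{1}_A$. Then,

$$n^{-1}\sum_{k=0}^{n-1}\mathbb{1}_A\circ S^k \to_{n\to\infty} \mathbb{P}(A) , \quad \mathbb{P}\text{-a.s.}$$

which implies, since $\mathbb{P}$-a.s., $\mathbb{1}_A\circ S^k = \mathbb{1}_A$,

$$\mathbb{1}_A = \mathbb{P}(A) , \quad \mathbb{P}\text{-a.s.}$$

Since $\mathbb{1}_A$ takes values in $\{0,1\}$, then necessarily $\mathbb{P}(A) = 0$ or 1. The proof is concluded. □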
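The time-average convergence (A.9) can be observed on a simple deterministic system. The sketch below does not use the shift on sequence space: as an assumption made purely for illustration, it takes the irrational rotation $\omega\mapsto\omega+\alpha \bmod 1$ on $[0,1)$ with Lebesgue measure, a standard example of an ergodic measure-preserving dynamical system, together with the bounded function $h = \mathbb{1}_{[0,1/2)}$, whose expectation is $1/2$.

```python
import math

def birkhoff_average(h, tau, omega0, n):
    """Compute n^{-1} * sum_{k=0}^{n-1} h(tau^k(omega0))."""
    total, omega = 0.0, omega0
    for _ in range(n):
        total += h(omega)
        omega = tau(omega)
    return total / n

alpha = math.sqrt(2.0) - 1.0                 # irrational rotation angle

def rotation(omega):
    return (omega + alpha) % 1.0

def h(omega):
    return 1.0 if omega < 0.5 else 0.0       # indicator of [0, 1/2)

avg = birkhoff_average(h, rotation, 0.1, 10000)
print(avg)
```

The printed average is close to $\mathbb{E}(h) = 1/2$; for a non-ergodic map (for instance a rational rotation) the limit would in general depend on the starting point.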

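Appendix B. Consistency of max-estimators using stationary approximations

Let $\mathsf{X}$ be a Polish space equipped with its Borel sigma-field $\mathcal{X}$ and let $S$ denote the shift operator on $\mathsf{X}^{\mathbb{Z}}$ defined in Appendix A. Assume that $(\mathsf{X}^{\mathbb{Z}},\mathcal{X}^{\otimes\mathbb{Z}},\mathbb{P},S)$ is a measure-preserving ergodic dynamical system. Denote by $\mathbb{E}$ the expectation operator associated to $\mathbb{P}$.

Let $(\ell^\theta , \theta\in\Theta)$ be a family of measurable functions $\ell^\theta : \mathsf{X}^{\mathbb{Z}}\to\mathbb{R}$, indexed by $\theta\in\Theta$ where $(\Theta,d)$ is a compact metric space, and denote $\bar L^\theta_n := n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k$. Moreover, consider $(L^\theta_n , n\in\mathbb{N}^*, \theta\in\Theta)$ a family of upper-semicontinuous functions $L^\theta_n : \mathsf{X}^{\mathbb{Z}}\to\mathbb{R}$ indexed by $n\in\mathbb{N}^*$ and $\theta\in\Theta$. Consider the following assumptions:

(C3) $\mathbb{E}(\sup_{\theta\in\Theta}\ell^\theta_+) < \infty$,

(C4) $\mathbb{P}$-a.s., the function $\theta\mapsto\ell^\theta$ is upper-semicontinuous,

(C5) $\lim_{n\to\infty}\sup_{\theta\in\Theta}|L^\theta_n - \bar L^\theta_n| = 0$, $\mathbb{P}$-a.s.

Let $\{\bar\theta_n : n\in\mathbb{N}^*\}\subset\Theta$ and $\{\hat\theta_n : n\in\mathbb{N}^*\}\subset\Theta$ be such that for all $n\ge 1$,

$$\bar\theta_n\in\operatorname{argmax}_{\theta\in\Theta}\bar L^\theta_n , \qquad \hat\theta_n\in\operatorname{argmax}_{\theta\in\Theta}L^\theta_n .$$

Assumptions (C3-4) are quite standard and can be adapted directly from Pfanzagl (1969) (which treated the case of independent $\{X_n\}_{n\in\mathbb{N}}$). For the sake of clarity, we provide here a short and self-contained proof.

Theorem 35. Assume (C3-4).

(i) Then, $\lim_{n\to\infty}d(\bar\theta_n,\Theta_\star) = 0$, $\mathbb{P}$-a.s., where $\Theta_\star := \operatorname{argmax}_{\theta\in\Theta}\mathbb{E}(\ell^\theta)$.

(ii) Assume in addition that (C5) holds. Then, $\lim_{n\to\infty}d(\hat\theta_n,\Theta_\star) = 0$, $\mathbb{P}$-a.s. Moreover,

$$\lim_{n\to\infty}L^{\hat\theta_n}_n = \sup_{\theta\in\Theta}\mathbb{E}(\ell^\theta) , \quad \mathbb{P}\text{-a.s.} \tag{B.1}$$
$$\forall\theta\in\Theta , \quad \lim_{n\to\infty}L^\theta_n = \mathbb{E}(\ell^\theta) , \quad \mathbb{P}\text{-a.s.} \tag{B.2}$$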

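Proof. Proof of (i). First note that according to Proposition 34 and (C3), for all $\theta\in\Theta$, $\lim_{n\to\infty}\bar L^\theta_n$ exists $\mathbb{P}$-a.s., and

$$\lim_{n\to\infty}\bar L^\theta_n = \lim_{n\to\infty}n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k = \mathbb{E}(\ell^\theta) , \quad \mathbb{P}\text{-a.s.} \tag{B.3}$$

Let $K$ be a compact subset of $\Theta$. For all $\theta_0\in K$, $\mathbb{P}$-a.s.,

$$\limsup_{\rho\to 0}\limsup_{n\to\infty}\sup_{\theta\in B(\theta_0,\rho)}n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k \le \limsup_{\rho\to 0}\limsup_{n\to\infty}n^{-1}\sum_{k=0}^{n-1}\sup_{\theta\in B(\theta_0,\rho)}\ell^\theta\circ S^k = \limsup_{\rho\to 0}\mathbb{E}\Bigl(\sup_{\theta\in B(\theta_0,\rho)}\ell^\theta\Bigr) , \tag{B.4}$$

where the last equality follows from (C3) and Proposition 34. Moreover, by the monotone convergence theorem applied to the nonincreasing function $\rho\mapsto\sup_{\theta\in B(\theta_0,\rho)}\ell^\theta$, we have

$$\limsup_{\rho\to 0}\mathbb{E}\Bigl(\sup_{\theta\in B(\theta_0,\rho)}\ell^\theta\Bigr) = \mathbb{E}\Bigl(\limsup_{\rho\to 0}\sup_{\theta\in B(\theta_0,\rho)}\ell^\theta\Bigr) \le \mathbb{E}(\ell^{\theta_0}) , \tag{B.5}$$

where the last inequality follows from (C4). Combining (B.4) and (B.5), we obtain that for all $\eta > 0$ and $\theta_0\in K$, there exists $\rho^{\theta_0} > 0$ satisfying

$$\limsup_{n\to\infty}\sup_{\theta\in B(\theta_0,\rho^{\theta_0})}n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k \le \mathbb{E}(\ell^{\theta_0}) + \eta \le \sup_{\theta\in K}\mathbb{E}(\ell^\theta) + \eta , \quad \mathbb{P}\text{-a.s.}$$

Since $K$ is a compact subset of $\Theta$, we can extract a finite subcover of $K$ from $\bigcup_{\theta_0\in K}B(\theta_0,\rho^{\theta_0})$, so that

$$\limsup_{n\to\infty}\sup_{\theta\in K}n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k \le \sup_{\theta\in K}\mathbb{E}(\ell^\theta) + \eta , \quad \mathbb{P}\text{-a.s.} \tag{B.6}$$

Since $\eta$ is arbitrary, we obtain

$$\limsup_{n\to\infty}\sup_{\theta\in K}n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k \le \sup_{\theta\in K}\mathbb{E}(\ell^\theta) , \quad \mathbb{P}\text{-a.s.} \tag{B.7}$$

Moreover, by (B.5), we get

$$\limsup_{\rho\to 0}\sup_{\theta\in B(\theta_0,\rho)}\mathbb{E}(\ell^\theta) \le \limsup_{\rho\to 0}\mathbb{E}\Bigl(\sup_{\theta\in B(\theta_0,\rho)}\ell^\theta\Bigr) \le \mathbb{E}(\ell^{\theta_0}) .$$

This shows that $\theta\mapsto\mathbb{E}(\ell^\theta)$ is upper-semicontinuous. As a consequence, $\Theta_\star := \operatorname{argmax}_{\theta\in\Theta}\mathbb{E}(\ell^\theta)$ is a closed and nonempty subset of $\Theta$ and therefore, for all $\varepsilon > 0$, $K_\varepsilon := \{\theta\in\Theta ; d(\theta,\Theta_\star)\ge\varepsilon\}$ is a compact subset of $\Theta$. Using again the upper-semicontinuity of $\theta\mapsto\mathbb{E}(\ell^\theta)$, there exists $\theta_\varepsilon\in K_\varepsilon$ such that for all $\theta_\star\in\Theta_\star$,

$$\sup_{\theta\in K_\varepsilon}\mathbb{E}(\ell^\theta) = \mathbb{E}(\ell^{\theta_\varepsilon}) < \mathbb{E}(\ell^{\theta_\star}) .$$

Finally, combining this inequality with (B.7), we obtain that $\mathbb{P}$-a.s.,

$$\limsup_{n\to\infty}\sup_{\theta\in K_\varepsilon}\bar L^\theta_n = \limsup_{n\to\infty}\sup_{\theta\in K_\varepsilon}n^{-1}\sum_{k=0}^{n-1}\ell^\theta\circ S^k \le \sup_{\theta\in K_\varepsilon}\mathbb{E}(\ell^\theta) < \mathbb{E}(\ell^{\theta_\star}) \overset{(1)}{=} \lim_{n\to\infty}\bar L^{\theta_\star}_n \le \liminf_{n\to\infty}\bar L^{\bar\theta_n}_n , \tag{B.8}$$

where (1) follows from (B.3). This inequality ensures that $\bar\theta_n\notin K_\varepsilon$ for all $n$ larger than some $\mathbb{P}$-a.s. finite integer-valued random variable. This completes the proof of (i) since $\varepsilon$ is arbitrary.

Proof of (ii). First note that (B.2) follows from (B.3) and (C5). Let $\theta_\star$ be any point in $\Theta_\star$. Then, $\mathbb{P}$-a.s.,

$$\mathbb{E}(\ell^{\theta_\star}) \overset{(1)}{=} \liminf_{n\to\infty}\bar L^{\theta_\star}_n \overset{(2)}{\le} \liminf_{n\to\infty}\bar L^{\bar\theta_n}_n \le \limsup_{n\to\infty}\bar L^{\bar\theta_n}_n = \limsup_{n\to\infty}\sup_{\theta\in\Theta}\bar L^\theta_n \overset{(3)}{\le} \sup_{\theta\in\Theta}\mathbb{E}(\ell^\theta) = \mathbb{E}(\ell^{\theta_\star}) ,$$

where (1) follows from (B.3), (2) is direct from the definition of $\bar\theta_n$ and (3) is obtained by applying (B.7) with $K = \Theta$. Thus,

$$\bar L^{\bar\theta_n}_n \to_{n\to\infty} \mathbb{E}(\ell^{\theta_\star}) , \quad \mathbb{P}\text{-a.s.} \tag{B.9}$$

Denote $\delta_n := \sup_{\theta\in\Theta}|L^\theta_n - \bar L^\theta_n|$. We get

$$\bar L^{\bar\theta_n}_n - \delta_n \overset{(1)}{\le} L^{\bar\theta_n}_n \overset{(2)}{\le} L^{\hat\theta_n}_n \le \bar L^{\hat\theta_n}_n + \delta_n \overset{(3)}{\le} \bar L^{\bar\theta_n}_n + \delta_n , \tag{B.10}$$

where (1) follows from the definition of $\delta_n$, (2) from the definition of $\hat\theta_n$ and (3) from the definition of $\bar\theta_n$. Combining the above inequalities with (B.9) and (C5) yields (B.1). (B.10) also implies that

$$\bar L^{\bar\theta_n}_n - 2\delta_n \le \bar L^{\hat\theta_n}_n \le \bar L^{\bar\theta_n}_n ,$$

which yields, using (B.8),

$$\limsup_{n\to\infty}\sup_{\theta\in K_\varepsilon}\bar L^\theta_n < \liminf_{n\to\infty}\bar L^{\hat\theta_n}_n = \limsup_{n\to\infty}\bar L^{\hat\theta_n}_n = \mathbb{E}(\ell^{\theta_\star}) , \quad \mathbb{P}\text{-a.s.}$$

where $K_\varepsilon := \{\theta\in\Theta ; d(\theta,\Theta_\star)\ge\varepsilon\}$. Therefore, $\hat\theta_n\notin K_\varepsilon$ for all $n$ larger than some $\mathbb{P}$-a.s. finite integer-valued random variable. The proof is completed since $\varepsilon$ is arbitrary.

□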

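Lemma 36. Let $\{V_n\}_{n\in\mathbb{N}}$ be a sequence of strict-sense stationary random variables on the same probability space $(\Omega,\mathcal{F},\mathbb{P})$. Denote by $\mathbb{E}$ the associated expectation operator and assume that $\mathbb{E}[(\ln|V_0|)_+] < \infty$. Then, for all $\eta\in(0,1)$,

$$\lim_{k\to\infty}\eta^kV_k = 0 , \quad \mathbb{P}\text{-a.s.}$$

Proof. Let $\eta\in(0,1)$. For all $\varepsilon > 0$,

$$\sum_{k=0}^{\infty}\mathbb{P}(\eta^k|V_k|\ge\varepsilon) = \sum_{k=0}^{\infty}\mathbb{P}(\ln|V_0| - \ln\varepsilon\ge -k\ln\eta) \le 1 + \frac{\mathbb{E}[(\ln|V_0| - \ln\varepsilon)_+]}{-\ln\eta} < \infty ,$$

where the last inequality follows from $\mathbb{E}[(\ln|V_0|)_+] < \infty$. The proof follows by applying the Borel-Cantelli lemma. □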
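Lemma 36 only requires a logarithmic moment, so it applies even to heavy-tailed sequences with infinite mean. The sketch below is an illustrative simulation under assumptions that are not in the paper: the $V_k$ are i.i.d. (hence stationary) with $\ln V_k$ exponentially distributed with mean 1, so that $\mathbb{E}[(\ln V_0)_+] = 1$ while $\mathbb{E}[V_0] = \infty$; the seed and the value of $\eta$ are arbitrary.

```python
import math
import random

rng = random.Random(0)   # fixed seed, for reproducibility only
eta = 0.9

# V_k = exp(E_k) with E_k ~ Exp(1): finite log-moment, infinite mean.
values = [eta ** k * math.exp(rng.expovariate(1.0)) for k in range(1, 2001)]

# The geometric factor eta^k eventually dominates the heavy tail of V_k.
tail_max = max(values[1000:])
print(tail_max)
```

The largest value of $\eta^kV_k$ over the second half of the run is already negligible, in line with the almost-sure convergence to 0.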